Introduction

Werner K. Honig
and
J. E. R. Staddon


Ten years and more have passed since the publica-
tion of Operant Behavior, which was the first effort
to provide a reasonably comprehensive account of
those areas of thought and research in psychology
which were influenced substantially by operant meth-
ods. The time has come for a reassessment of several of
those areas, for a description of other topics involving
operant methods or bearing upon them, and for a
conceptual examination of the fundamental principles
of the Experimental Analysis of Behavior and its
relationship to other parts of experimental psychol-
ogy. The present Handbook cannot pretend to accom-
plish all of these aims, or even to do justice to any,
but it provides much relevant empirical material, and
many discussions which are both incisive and enlight-
ening.

Certain aspects of operant behavior were deliber-
ately excluded from the outset. The enormous in-
crease in the use of operant methods for both funda-
mental and applied research makes it impossible to
cover these major areas in one volume. This book is
devoted entirely to topics in experimental psychology.
We have welcomed the planning and publication of
two companion books: One is the Handbook of Ap-
plied Operant Behavior, edited by Harold Leitenberg,
and in press with Prentice-Hall. The other is entitled
Social and Instrumental Processes: Foundations and
Applications of Behavioral Analysis. It is edited by
T. A. Brigham and A. C. Catania, and will be pub-
lished by Irvington.

Our own book provides a mixture of experimental 
and theoretical material which reflects the current 
status of operant behavior. No chapter is “strictly 
experimental” in the sense that it fails to raise con- 
ceptual and theoretical issues, or concentrates entirely 
upon methodology. Only a few chapters—those on 
language—are largely theoretical, although empirical 
studies do, of course, provide some of the material for 
discussion. Perhaps we would be wisest to let the 
chapters follow without further comment, but after 
giving them many hours of scrutiny, we succumb to 
the temptation of providing the reader with a few 
general impressions. 

First, it is becoming quite clear that operant 
methods and principles are becoming increasingly in- 
tegrated in general experimental psychology. At first, 
the operant movement (if such it should be called) 
was quite isolated, largely due to negative reactions 
from its critics and enemies, who were put off by 
Skinner’s radical behaviorism, by the artificial and
restricted environment of the “Skinner box”, by the
lack of concern with theoretical issues, or by the ap- 
parent threat to traditional freedoms and values posed 
by the prospective control of human behavior through 
operant methods. Furthermore, operant methods facil- 
itated new research strategies with little regard for 
traditional principles of experimental design. The in- [...]
of treatments was indeed promoted as a kind of model
of experimental method (Sidman, 1960). But the isola-
tion felt by workers in the area of operant behavior
was partly self-imposed. One does not gain a sense
of compromise from Skinner’s writings. The impres-
sion conveyed is that the study of operant behavior
should not be contaminated by attention to tradi-
tional problems, methods, and theoretical issues. The
movement was named The Experimental Analysis of
Behavior, suggesting that the operant method pro-
vides the only valid and constructive approach to the
systematic study of behavior.

But the interest in operant behavior has not de-
clined, and the use of operant methods is certainly
no longer restricted to Skinner’s students and associ-
ates. The advantages of these methods, reviewed in
the preface to the previous volume (Honig, 1966), are
so clear that many psychologists used them to study
problems outside the original purview of the Experi-
mental Analysis of Behavior. As interest in the con-
struction of and debate over grand theoretical systems
declined, experimental psychologists concentrated on
the explanation of more limited aspects of behavior,
and these could be studied systematically through
operant methods in a tractable experimental setting
that provided greater flexibility than its reputation
had suggested. Younger psychologists, no longer ab-
sorbed by debates among “schools” of psychology, felt
less inclined to exclude operant behavior from their
scope of interest, or, conversely, to restrict their atten-
tion to the limited range of problems addressed by
Skinner. While the “passing of the pressing of the 
bar” never did come about, Skinner's methods and 
principles have not dominated experimental method- 
ology, nor have they supplanted all other means by 
which orderly data can be obtained. The operant is 
still a very viable unit, as demonstrated in the many 
pages of this text, but it can no longer be so clearly 
separated from other modes of behavior. The relation 
of operants to the latter has come under close exami- 
nation over the last ten years. Some of the conclusions 
are worth reviewing. 

Operant behavior is studied by arranging for the 
animal to affect its environment in some way—by 
pressing a lever, pecking a lighted key, or breaking a 
photobeam. This effect, which can be accomplished
in any way the animal chooses, is termed the response. 
If it changes the environment in a way that has moti- 
vating consequences, giving access to food or water, 
or escaping electric shock, the animal will generally 
learn to make the response more frequently. This 
change defines the consequence as a reinforcer. The 
prescribed relation between responding and reinforce-
ment is a response-reinforcer contingency. The fre-
quency of the response will be strongly affected by 
stimuli that signal availability or unavailability of the 
reinforcer (discriminative stimuli). 
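
The relation between response, reinforcer, and
discriminative stimulus described above can be sketched
in a few lines of Python. This is only an illustration
under stated assumptions: the fixed-ratio rule, the class
name FixedRatioContingency, and the function run_session
are hypothetical choices made for the example and are not
taken from the text. The sketch models only the
experimenter's side of the arrangement (when a response
produces the reinforcer), not the animal's learning.

```python
from dataclasses import dataclass


@dataclass
class FixedRatioContingency:
    """Hypothetical rule: every `ratio`-th response made while the
    discriminative stimulus is present produces the reinforcer."""
    ratio: int
    count: int = 0

    def respond(self, stimulus_on: bool) -> bool:
        """Register one response; return True if it is reinforced."""
        if not stimulus_on:
            # Stimulus signals unavailability: no programmed consequence.
            return False
        self.count += 1
        if self.count >= self.ratio:
            self.count = 0
            return True
        return False


def run_session(contingency, stimulus_by_response):
    """Count reinforcers earned over a sequence of responses.

    `stimulus_by_response` holds one boolean per response, stating
    whether the discriminative stimulus was present at that moment.
    """
    return sum(contingency.respond(s) for s in stimulus_by_response)
```

With a ratio of 3, for example, nine responses made in
the presence of the stimulus earn three reinforcers, and
the same nine responses made in its absence earn none.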

The rule or rules prescribing the relations among 
stimuli, responses, and reinforcers is a reinforcement 
schedule. Schedules are of interest both in their own 
right, and as useful “contrivances”, in Jenkins’ phrase,
that can be used to tease apart the mechanisms that 
underlie learned behavior. Much of this book has as 
its experimental basis the very extensive work on 
schedules that has taken place in the past fifteen 
years. One approach is to treat reinforcement sched-
ules as an opportunity for a sort of experimental
ecology, as a way to set up a novel set of relations
between an animal’s behavior and its consequences,
and then to observe how the subject copes with this
new situation. By studying a range of situations a
taxonomy may be derived and general principles in-
duced in Baconian fashion. Although much data of
this sort has been gathered and is reviewed by
Zeiler in this book, Skinner’s strictures against theoriz-
ing and “botanizing” have discouraged both syste-
matic exploration of non-schedule variables, such as
species differences and type of reinforcer, and persis- 
tent attempts to make theoretical inferences. Zeiler 
describes schedule control based on the delivery of 
primary reinforcers in Chapter 8, while Gollub re- 
views the parallel role of conditioned reinforcers in 
Chapter 10. These writers analyse in detail the con- 
trolling variables, involving both response-contingent 
and non-contingent delivery of stimuli; their treat- 
ments verge on theoretical accounts of the temporal 
patterning of behavior. 

Schedules in the broader sense are also used as 
analytic devices. A particular set of relations between 
responding, stimuli, and the reinforcer is used for the 
study of particular empirical or theoretical questions. 
The elegant demonstrations of autoshaping (Brown 
and Jenkins, 1968; Williams and Williams, 1969) 
showed that key-pecking in the pigeon can be
generated and maintained through “classical” con- 
tingencies. Likewise, Reynolds’ (1961) experiments 
on behavioral contrast are among many other ex- 
amples of the power of operant techniques to reveal 
properties of learned behavior that are indirectly
determined by effective schedules. Chapter 3 by 
Schwartz and Gamzu brings together these lines of 
work. When it is shown that schedules controlling 
instrumental behavior generate and maintain classi- 
cally conditioned responses, the whole relationship 
between these two classes of behavior needs to be re-
examined. Teitelbaum undertakes this task in Chapter
1; he shows how physiological techniques are used to
identify the motivational substrates that underlie in-
strumental behavior in the intact animal.

While the use of operant techniques continues to 
prosper, the conceptual framework that has grown 
up around them has begun to show signs of strain in 
recent years. The terms “response”, “reinforcer”, and 
“stimulus” imply classes of events that are similar 
in their essential properties and can be combined in 
arbitrary ways. One reinforcer is, if not the same 
as another, at least not qualitatively different. All re- 
sponses (at least all operant, as opposed to respondent, 
responses) are more or less equally reinforcible by all 
reinforcers and can with equal facility come under the 
control of any stimulus. This story is a familiar one 
and recent discussions of “constraints on learning” 
(Hinde and Stevenson-Hinde, 1973) have made its 
imperfections well known. The important point with 
respect to constraints is that the selection of one par- 
ticular stimulus, response, or reinforcer may well limit 
the selection of others that will be effective in con- 
junction with it. Some stimuli may control behavior 
more readily in avoidance paradigms than in con-
junction with positive reinforcement, and the con-
verse can also be demonstrated. This sort of constraint
need not invalidate, although it may extend, princi- 
ples obtained with the use of the most appropriate 
experimental elements. In most situations the func-
tional elements “stimulus”, “response”, and “rein-
forcer” can be identified and the relations among
them are more or less what we have learned to expect
from studies of bar pressing or key pecking. So while
the “arbitrary response” is no longer with us, the
concept of reinforcement contingency has, if any- 
thing, gained in scope. 

Instrumental responses are closely related to the 
species-specific consummatory behavior which is con- 
tingent upon them, and this relationship underlies 
some of the constraints just mentioned. In Chapter 
4, Dunham reviews some of the “misbehaviors of
organisms.” These observations have provided us with
a general principle, namely that instrumental re-
sponses often approximate consummatory behaviors,
and may be easiest to teach when they do so. Further-
more, it is likely that if instrumental behavior is sep-
arated from species-specific patterns, greater flexibility
may be observed. Responses which, let us say, control 
the duration of stimuli differentially correlated with 
reward, may be more tractable, and thus can be more 
“arbitrary” than those that procure the reward itself. 
When consummatory behavior is preceded (and “pre-
dicted”) by a signal it often emerges as a classically
conditioned, or “autoshaped”, response to that signal.

The significance of autoshaping for the area of oper- 
ant behavior is threefold. First, the status of the arbi- 
trary operant was reduced when it was discovered that 
such cherished instrumental behaviors as the pigeon’s 
key peck could readily be conditioned through clas- 
sical means. Second, it suggested that both the form 
and the quantity of operant behavior could be in- 
fluenced through classical (stimulus) contingencies, as 
a current analysis of contrast effects suggests. Schwartz 
and Gamzu trace this relationship in their chapter. 
Third, the “instinctive drift” which underlies the
misbehavior of organisms can be explained through 
the operation of classical conditioning principles: An 
instrumental response reliably precedes the consum- 
matory behavior occasioned by the presentation of the 
reinforcer. Thus, depending on the schedule, the in-
strumental response is a more or less reliable predictor
of the consummatory response. In accordance with 
classical conditioning principles the instrumental re- 
sponse may therefore come to act as a conditioned 
stimulus, eliciting components of the unconditioned 
response. To the extent that instrumental and un-
conditioned responses are incompatible, interference
may result and instrumental responding may be sup- 
pressed, as the Brelands found. On the other hand, 
if the instrumental response is judiciously chosen to 
be compatible with the consummatory response, facili- 
tation will be the rule. However, the classically condi- 
tioned nature of the response can be revealed by 
special scheduling arrangements such as the Williams’s 
omission procedure. The chapters by Schwartz and 
Gamzu, Dunham, and Staddon deal with these mat- 
ters. 

A less direct, but equally fruitful, approach to the 
relationship between instrumental and consummatory 
responses is provided by the presentation of “free” 
reinforcers on a temporally defined schedule. Work of 
this nature indicates that terminal behaviors approxi- 
mating the consummatory response occur shortly be- 
fore the presentation of the reinforcer, while other 
interim behaviors occur when the likelihood of rein- 
forcement is low. This method is but one example of 
a significant change in operant experiments, namely 
the simultaneous observation and recording of various 
responses in addition to the instrumental behavior. 
Such observation, facilitated by closed-circuit televi-
sion, is, of course, a change from the traditional em-
phasis upon a single “externalized” response, but it 
has enormously enriched and broadened the analysis 
of behavior in a controlled environment. It also pro- 
vides a bridge to the ethological study of animal behav- 
ior, a field whose avowed interests sometimes appear 
very different from those of operant psychology (al- 
though the reality of the difference is often arguable), 
but whose methods are quite similar. Thus, while the 
pressing of the bar has not passed, it has been supple- 
mented by other concurrent observations. Staddon 
reviews this work in Chapter 5, and Hutchinson in 
Chapter 14 describes related experimental results in 
situations involving electric shock. 

Just as the concept of the response has undergone 
a searching analysis which is reflected in this book, the 
process of reinforcement has also been re-evaluated, 
and in several very different ways. Premack’s theory 
(Premack, 1965), which was being developed while 
Operant Behavior was being written, has left its mark. 
The Experimental Analysis of Behavior is well suited 
to the notion that reinforcers have no absolute quali- 
ties, but are functionally defined, and situationally 
determined. The development of these ideas is traced 
by Dunham in Chapter 4. Morse and Kelleher, in 
Chapter 7, take a yet more radical view, suggesting 
that reinforcement and punishment are often the 
outcome of particular scheduling contingencies, and 
their functional analysis is not necessarily bound up 
with the presumed noxious or appetitive qualities. 
Their careful argument and analysis cannot be sum- 
marized in a few words, but one of their contributions 
should be pointed out: Once and for all, they separate 
the presumed appetitive or aversive qualities of re- 
sponse-contingent stimuli from the identification of 
such stimuli as reinforcers or punishers in terms of 
their effects in maintaining patterns of instrumental 
responding. 

The process of reinforcement is analyzed in this 
book in two other, quite different ways. In Chapter 
6, Satinoff and Hendersen describe the maintenance
of an internal state, namely temperature, via instru- 
mental behavior. This leads quite naturally into feed- 
back theory. Here reinforcing effects are best regarded 
not in terms of some presumed strengthening effect 
but as adjustments to deviations from an internal “set 
point”. Operant behavior is but one of several mech-
anisms, physiological as well as behavioral, that help
maintain the stability of the milieu interne. It is inter- 
esting that behavioral thermoregulatory mechanisms 
appear to be phylogenetically older than the physio- 
logical regulatory processes that supplement behavior 
in so-called warm-blooded animals. 
This approach is presented in an even more radical
form in Chapter 2 by Collier, Hirsch, and Kanarek, 
who describe a situation where animals live in the 
experimental setting and can gain all of their required 
food (or water) in the form of unrestricted meals. 
When the meal, rather than the pellet, becomes the 
unit of reward, it supports a very large amount of 
behavior, in spite of the absence of the deprivation 
condition commonly thought necessary for the per- 
formance of an instrumental repertoire. These find- 
ings may pose some real problems for reinforcement 
theory, while at the same time they support the “rele- 
vance” of the Experimental Analysis of Behavior to 
human affairs, since the environment that Collier et
al. are working with is rather naturalistic, and pro-
vides an apt parallel for much of the human condi-
tion. Aside from its contributions toward the
theoretical analysis of reinforcement, this research 
broadens our concept of instrumental behavior as an 
activity rather than a response. It can occur in 
“bouts”, as do the consummatory behaviors contin- 
gent upon it. Such a view permits a conceptual re- 
evaluation of instrumental behavior as an activity that 
is chosen, from among others, for a proportion of the 
available time. 

The theoretical analysis of reinforcement has in 
this book also been extended far beyond the parallel 
efforts in 1966, especially with respect to quantifica- 
tion. Where behavior is related in an orderly fashion 
to other, controlled aspects of the environment, mathe- 
matical analysis becomes fruitful. De Villiers devotes 
Chapter 9 to an examination of quantitative versions 
of the Law of Effect, largely through a review of ex- 
periments on choice. His approach resembles Pre- 
mack’s and, in a related field, Helson’s adaptation-level 
theory, in being relativistic. While debate continues 
on the best form of theory, the notion that levels of 
instrumental responding are determined by the con- 
text of reinforcement, by relative rather than absolute 
reinforcement rates, is clearly here to stay. De Villiers 
well illustrates the trend towards integration of oper- 
ant with general experimental psychology because he 
re-analyzes results from standard discrete-trial situ-
ations in accordance with his quantitative formula-
tions. To this end, he considers running in an alley
(for example), as an extended quantifiable response, 
which can be represented in such a way to make it 
amenable to an analysis originally based on concur- 
rent operants. Conditioned reinforcement is also sub- 
jected to a mathematical treatment in Chapter 11 by 
Fantino, as another illustration of the quantitative 
trend in the theory of operant behavior. 
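
As an illustration of what a relativistic, quantitative
statement of the Law of Effect looks like, one widely
cited form is Herrnstein's matching relation for two
concurrent alternatives. It is offered here only as an
example; this introduction does not say which specific
equations Chapter 9 adopts. The symbols are assumptions
of the example: B_1 and B_2 are the response rates on
the two alternatives, and R_1 and R_2 are the
reinforcement rates obtained from them.

```latex
\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}
```

On this relation the share of responding given to one
alternative depends on its share of the total
reinforcement, so the same absolute reinforcement rate
can support different levels of responding in different
contexts.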

Not only do we find new and very different treat- 
ments of the concepts of response and reinforcement, 
but the role played by theoretical analysis itself seems
quite to have changed in the last ten years, no doubt 
due in part to the reanalysis of the basic terms and 
concepts. The chapters on stimulus control and aver-
sive control, in addition to those cited here already,
attest to this. Concepts such as inhibition and atten- 
tion no longer require justification or defense, but 
rather a searching analysis of their determining char- 
acteristics, as Rilling and Mackintosh show in Chap- 
ters 15 and 16 respectively. In his careful review of 
the role of errors in the attainment of discrimina- 
tions, Rilling concludes that the correlation between 
discriminative stimuli and rewards, rather than the 
emission of unreinforced responses, determines the
inhibitory properties of a stimulus. Again, a relation- 
ship between stimuli, rather than between a response 
and a stimulus, seems to govern processes that in turn 
control instrumental performance. Mackintosh uses 
the slope of the generalization gradient as an index of 
attention to an extent that could hardly have been 
anticipated a decade ago. His theoretical discussion 
takes into account the role of repeated instrumental 
responses as stimuli which in their own right may 
share stimulus control with other events explicitly 
programmed by the experimenter. While Blough and 
Blough, in Chapter 17, are less concerned with the- 
oretical questions in the study of animal psycho- 
physics, they give an account of signal detection theory 
as one method for the analysis of instrumental be- 
haviors used to assess the perception of, and the dis- 
crimination between, stimuli. 

Other theoretical treatments, particularly in the 
area of aversive control, reflect a more purely behav-
ioristic orientation. In Chapter 12 Blackman provides
a much needed analysis of the importance of the
operant baseline in its interaction with other funda- 
mental processes when conditioned suppression is ob- 
tained. This topic was not included in the predecessor 
of this volume. Hineline elaborates on avoidance in 
Chapter 13; he continues an analytic orientation to- 
ward free-operant avoidance that was already in pro- 
gress a decade ago. It is interesting how this area has 
changed from an emphasis on the methods which will 
produce free-operant avoidance to experiments that 
analyze the variables and processes that maintain such 
behavior. The current theoretical context emphasizes
the organism’s evaluation, as it were, of the correla- 
tions between responding and the absence of aversive 
events. 

Operant methods have also become ubiquitous in 
the study of electrical, chemical, and physiological 
determinants of behavior. Aside from some general 
discussion in Teitelbaum’s chapter, three other chap-
ters are specifically concerned with these problems. In
these areas we see that again, a given treatment or a
given behavioral effect may play more than one func-
tional role in the ultimate patterning of responses.
Thus, electrical brain stimulation can act as a power-
ful reinforcer, as Mogenson and Cioé show in Chapter
19, but it can also elicit patterns of behavior closely 
related to the reinforcing effect. Furthermore, a func- 
tional analysis of the reinforcing process involved with 
such stimuli reveals that they do not act very differ- 
ently from “standard” reinforcers, once parameters 
such as immediacy of delivery are controlled for. With 
the behavioral effects of drugs we see a converse set of 
relationships. Drugs are not limited to their tradi- 
tional actions as depressants, stimulants, and the like. 
They also can act as powerful reinforcers (or punish- 
ers), as those who are concerned with applied prob- 
lems can well testify. Thompson and Boren treat 
behavioral pharmacology in Chapter 18. The effects of 
stress on biochemical and other physiological processes 
are reviewed by Brady and Harris in Chapter 20, and 
here again we have evidence of the dual role of such 
“internal effects”. They may reflect external treat-
ments which control behavior, but if they can be made
accessible to the subject by being “externalized” as 
feedback stimuli, they can participate in the control 
of behavior as discriminative and reinforcing stimuli.

An analysis of language derived from the study of 
operant behavior in animals was proposed by Skinner 
in his 1957 tour de force, Verbal Behavior. This work 
has excited much subsequent controversy but little by 
way of empirical test. Noam Chomsky (1959) in a 
famous critique roundly condemned the work as 
empirically ill-founded, quantitatively implausible, 
and little more than a restatement of the familiar in 
neologistic terms. Chapter 21 by Robinson deals with
one aspect of this debate. He shows how a purely 
associationistic model can lead to the development 
of a language structure. Hence the existence of struc- 
ture in language does not require either that its basis 
is innate or that people learn “rules” in the conven-
tional sense. It is perhaps helpful to be reminded that
learning by association implies only that things be- 
come joined to other things through experience, a 
“mental chemistry” in Mill’s phrase, and not that the 
conjoined entities are of necessity stimuli and overt 
responses. In Chapter 22, Segal provides a clear and 
concise summary of the meat of Verbal Behavior, a 
book often cited but, we suspect, less often read. She 
stresses the parallels between Skinner’s views on lan- 
guage and contemporary structural approaches. She 
provides a framework for the conciliation of a conflict 
originally generated by these approaches, a conflict 
which was for a long time viewed as typical of the 
separation of operant analysis from more traditional
forms of theorizing. Her achievement is perhaps sym-
bolic of a more general rapprochement that will, in
our opinion, gain strength over at least the next few 
years. 

In many ways, then, research based on operant 
behavior is becoming more closely integrated into 
general experimental psychology. Theoretical ques-
tions are asked of the manner in which different forms
of operant behavior are generated and maintained. 
Yet this behavior is itself used in turn to obtain 
answers to theoretical questions of all kinds. Our brief 
overview of this book has stressed conceptual and 
theoretical developments relevant to operant behavior 
over the last ten years. We have said little about 
operant methodology itself, which has changed but 
little. Operant methods continue to be used widely as
tools; in many ways the parallel between the operant 
chamber in psychology and the microscope in biology 
is justified. The advantages of operant methods were
recounted a decade ago in the introduction that corre-
sponds to this article. These advantages have not
diminished. We hope that this handbook reaffirms the 
value of operant methods as well as the vitality of 
the empirical questions to which they continue to be 
applied. 
INTRODUCTION


As a practical approach to the control of behavior, 
B. F. Skinner’s operant psychology is clearly a success. 
His approach has been embodied in distinctive atti- 
tudes toward the study of learned motivated behavior, 
which in turn have generated a specialized termi- 
nology and have led to the design of automated and 
computerized equipment for detecting an individual’s 
behavior, for reinforcing it according to particular 
schedules, and for recording the way the behavior is 
shaped by the process (Ferster & Skinner, 1957). In 
psychopharmacology, these techniques have been used 
to generate stable base lines to assess the behavioral 
effects of drugs. Physiological psychologists use them 
to interpret the effects of localized brain damage
(Honig, 1966). In human education, teaching ma- 
chines and programmed texts are being developed to 
individualize and enhance the learning of conceptual 
material in a variety of fields (Anderson, 1967; Skin-
ner, 1961). In mental hospitals, therapists apply the
principles of reinforcement by using token economies 
to shape up socially acceptable behavior patterns 
(Kazdin & Bootzin, 1972). In the therapist’s office, 
impulsive behavior is brought under control (Halmi, 
Powers, & Cunningham, 1975; Stunkard, 1974; for 
possible perils, however, see Bruch, 1974). Experi- 
mental communities, presaged in Walden Two (Skin- 
ner, 1948a), are being explored as ways of solving the 
problems of social living. New journals, both theoret- 
ical and applied, are devoted to the operant approach, 
and the number of adherents continues to grow. 

However, some aspects of the concept of the oper-
ant have come under attack. Laboratory learning
phenomena such as autoshaping (Brown & Jenkins, 
1968) look like operants but seem not to fit the prin- 
ciples of operant conditioning, and there are other 
examples of the “misbehavior of organisms” (Breland
& Breland, 1961, 1966). Ethologically oriented workers 
encounter biological constraints on learning, in which 
specialized evolutionary adaptations, either in non- 
traditional physiological systems (taste-aversion learn- 
ing) or in nonmammalian species, suggest to some 
that the search for general laws of learning may be
premature or even unwarranted (Garcia, Hankins, &
Rusiniak, 1974; Hinde & Stevenson-Hinde, 1973; 
Rozin & Kalat, 1971; Shettleworth, 1972, 1975). 

Such scientific paradoxes indicate that our present 
thinking may need re-evaluation (Teitelbaum, 1974). 
We must go back to the history of our ideas to dis- 
cover how to revise them. In this chapter, I shall 
therefore discuss how the operant came to be, and 
then some of the phenomena that seem to pose diffi-
culties for it. These difficulties are related to similar
problems in our thinking about all motivated be-
havior. Finally, I will summarize new evidence from
the physiological study of brain-damaged animals and
people that suggests that it may be fruitful to look at
the operant in terms of levels of integration. 

The operant philosophy has structured, not only 
our thinking about behavior, but also our ideas about 
how to study it. Implicit in it has been the rejection
of alternative approaches, particularly physiological
analysis. I will try to characterize the various forms of
analysis and synthesis used in the experimental ap-
proach to understanding behavior in order to see why
operant analysis went one way, while physiological
analysis took another. Then I will point out that there
are new physiological approaches that can meet the
objections raised by Skinner and are compatible with
his thought and work. The time may be ripe to merge
the operant and physiological methodologies in a con-
certed intellectual and experimental attack upon the
levels of integration of the operant. Through be- 
havioral analysis of developing infants and of adults 
recovering from brain damage, we may extend our 
understanding of the operant. The result can be a 
basic behavioral approach that preserves the values of 
the operant while linking it, by a set of physiological 
principles, to the fields of neurology, physiological
psychology, developmental psychology and ethology. 
In short, we must bridge the gap between Sherrington 
and Skinner. 


HISTORICAL BACKGROUND


All psychologists assert a common goal—the attempt
to understand human behavior. However, apart from 
differences in topics of interest, they adopt different 
fundamental beliefs about the best way of reaching 
this goal. Such beliefs, which are characteristic of all 
the academic subdivisions of our field, have created 
different “schools,” many of which have become so 
insular that they hardly communicate with one 
another any more. For example, it is commonly felt 
that the “operant” approach is incompatible with a
physiological analysis of behavior. 

As Descartes (1637) pointed out, the experimental 
approach to understanding involves two intellectual 
or experimental processes: (1) breaking down a phe- 
nomenon into simpler elements (analysis) and (2) re- 
combining those parts to make sure that they are 
sufficient to reconstitute the original phenomenon 
(synthesis). It is in the choice of simpler elements, and
in the methods of using them to account for behavior,
that the various “schools” differ. For instance, physiological
psychologists have long tried to simplify behavior
by chopping the nervous system into smaller
chunks. This is the classic levels-of-function approach
to the nervous system used by Flourens (1824), Sherrington
(1906), and many others. Such experimental
analysis in animals yields direct evidence for simple
subcomponents of behavior, such as spinal reflexes,
and for their more complicated integration as postural
and movement patterns, like those described by Magnus
(1926) in decerebrate animals. Comparative psychologists,
like their European counterparts, the
ethologists, use the simpler nervous systems of insects,
birds, and fish to study reflexes and the more complex 
hormonally controlled instinctive patterns such as 
feeding, fighting, and mating. In general, they all 
agree with Descartes that to understand, one must 
simplify. 

However, these surgically or phylogenetically sim- 
plified preparations yield phenomena (reflexes, in- 
stinctive patterns) and theoretical constructs based on 
them that do not seem to help much in understanding 
the phenomena of language, thought, neurosis, and 
psychopathology that fascinate us in human behavior. 
We still do not know how to use our knowledge of 
these simple phenomena synthetically to predict or 
control very much that is significant in everyday human
life (Skinner, 1957).

We all face this dilemma very early in our study of
behavior, and it is at this point that we split up. Some
of us (the physiological types) go toward the molecular.
We say that behavior reflects the action of the
nervous system, so we must understand the latter before
anything else. Although we keep human behavior
in mind to return to eventually, we work on
animals and concentrate on understanding molecular
phenomena such as synaptic transmission (with possible
relevance to mechanisms of learning and memory),
and sensory physiology (how does the nervous
system transform a stimulus into a sensation?). Some
use electrical and chemical stimulation and ablation
to study brain mechanisms of motivation and reinforcement.
Others try to identify areas of the brain
concerned with learning and memory. Because function
must depend on structure, many spend a great
deal of time mapping functional systems neuroan- 
atomically. 

All physiological experimenters share the belief 
that the most fruitful experimental analysis is real,
not hypothetical. By a real analysis, I mean using
experimental techniques to isolate physically a fraction
of a more complex system, yielding a system for
study that has fewer variables acting on it, yet which 
still preserves the essential phenomena that are of in- 
terest; hence the emphasis on developing finer elec- 
trodes to measure the activity of fewer cells, or even 
one cell at a time, and on finding ever simpler nervous 
systems (e.g., the horseshoe crab Limulus or the sea 
hare Aplysia). Because human behavior is so complicated,
most feel they cannot now make much progress
with it, and they study molecular phenomena, firm in 
their belief that such knowledge is fundamental to 
human behavior and will eventually pay off. As a 
consequence, they deemphasize the synthetic applica- 
tion of their understanding. Many do not even try to 
extrapolate their findings to people, feeling that there 
is too great a gap between their observations on animals
and analogous phenomena in humans, in whom
cultural and social factors loom so large. 

When faced with this dilemma—dealing with sub- 
components of behavior that are real but too simple 
to use in controlling or predicting the interesting 
aspects of human behavior—many psychologists reject 
the physiological approach entirely. They pick important
aspects of human behavior—phenomena of language,
modes of problem solving, associative thought
processes, social attitudes toward others, etc. They 
bring them or their analogs into the laboratory and 
try to figure out the environmental and constitutional 
variables that determine them. From each particular 
approach they deduce hypothetical variables for use 
in explanatory and predictive theories. For example, 
people have thought in terms of frequency and 
recency of associations, perceptual dispositions or 
“sets,” stimulus-response (S-R) bonds in habit
strength, tendencies to increase or decrease “cognitive 
dissonance,” ego, id, or superego, etc. All are highly 
abstract, and even though they may be useful in try- 
ing to deal with real human problems, to a physiolog- 
ical psychologist they do not seem very tangible or 
relevant to known phenomena in the nervous system. 
So we drift further apart. 

In his book The Behavior of Organisms B. F. Skin- 
ner (1938) grappled with the same dilemma. He very 
carefully considered the value of simplifying behavior 
by neurosurgery, as in the work of Sherrington (1906)
on spinal reflexes. He realized that when such a simplification
was achieved, the main scientific value
for behavior was that for each reflex the adequate 
stimulus could be identified, and because of its close, 
virtually invariable association with the motor act, the 
laws governing the S-R correlation could be worked
out. The variables governing any reflex can be classi- 
fied into two types: (1) environmental (the strength, 
number, and duration of stimuli and their spatial 
and temporal interaction) and (2) organismic (the 
central states—such as hunger, fatigue, and hormonal 
conditions). In spinal reflexes, the organismic states 
do not seem to affect the S-R correlation very much 
and for most purposes are largely ignored. Because 
such reflexes seem relatively uninfluenced by learning 
or the central “motivational” states that affect learn- 
ing, the phenomena most germane to human behavior 
do not appear to have any obvious similarities to 
reflexes. 

When Pavlov (1927) discovered conditioned re- 
flexes, many psychologists believed that they had a 
simple system that could reveal the laws of animal 
and human learning. If an unconditioned reflex such 
as salivation at the sight or taste of food could come 
to be elicited by any arbitrary stimulus, such as a 
flash of light or the sound of a buzzer, and if this 
association could be remembered for long periods of 
time, then perhaps the laws of learning could be 
quickly worked out. Many still have faith in this 
paradigm (e.g., Moore, 1973). To Skinner, however, it
seemed clear that much of the behavior of animals 
and people was not based on autonomic responses, 
evoked automatically by stimuli. Most of their be- 
havior seems to be emitted as an act that modifies an 
environment in which no eliciting stimulus is readily 
identifiable, rather than automatically evoked as a 
respondent like salivation at the sight of a stimulus 
paired with food. Furthermore, attempts to synthesize 
an understanding of complex behavior from the con- 
cepts of the reflex, conditioned or unconditioned, led 
to fruitless “physiologizing” (a tendency to push explanation
back to the level of neural phenomena,
without any proof that such hypotheses are valid), or 
to a great deal of speculation about theoretical con- 
structs involved in learned behavior. In the extreme, 
the latter can be compared to the uselessness of medi- 
eval scholasticism—i.e., how many S-R bonds can 
dance on the head of a pin? To Skinner (1950, 1972a, 
1972b), both forms of hypothetical synthesis led away 
from direct contact with real phenomena and there- 
fore did more harm than good. 

In order to evaluate Skinner’s solution to the problem
of the analysis and synthesis of behavior, we
should first briefly review the common experimental
methods of scientific synthesis. After an analysis, real 
or theoretical, that has broken a phenomenon down 
into simpler parts, how do we put them back together? 
As far as I have been able to determine, there are five 
methods, four of which I have described earlier 
(Teitelbaum, 1967). In what seems an increasing order 
of abstraction (with, therefore, an increasing possibil- 
ity of error in their application), they are as follows: 


1. DIRECT SYNTHESIS


This method is often used in chemistry. When a 
chemist wishes to determine the nature of an unknown
substance, he breaks it down into its components.
If his analysis is correct, he should be able to
synthesize the original substance by taking the individual
components from completely different sources
and putting them together under the appropriate
environmental conditions. A classic example of this
in physiology was carried out by the Nobel Prize
winner George Wald in collaboration with Ruth
Hubbard (Hubbard & Wald, 1951). After years of
working out the experimental analysis of rhodopsin,
they took the individual subcomponents from completely
different sources and put them all together.
Purified opsin from the retinas of cattle, crystalline
alcohol dehydrogenase derived from horse liver, vitamin A
from fish liver oil, and cozymase (now called
DPN) from yeast when brought together in solution
formed a compound with all the properties of natural
rhodopsin. This is a beautiful example of the proof of
an analysis by direct synthesis. 


2. COUNTEREXPERIMENT: SYNTHESIS
AFTER FRACTIONATION


This was the favorite method of Claude Bernard
(1865), the great French physiologist. In essence, the
principle is: when a change occurs after you remove
something, put back a fraction of what you have removed.
If you restore the original state, the fraction
contains the essential ingredient. (If it is the only
sufficient ingredient, the remainder will not restore
the original state.) For instance, in a famous example
of Bernard’s application of this method, after removal
of the pancreas, rabbits waste away and die. If a different
pancreas is transplanted anywhere into the body
of such a pancreatectomized rabbit, it lives relatively
normally. Therefore, the transplanted pancreas, even
without its normal nervous connections, can maintain 
life. If an extract of pancreas is injected daily into a 
pancreatectomized rabbit, it too will live normally. 
Therefore, something in the extract is vital. Continue
this process of analysis and synthesis, and eventually,
as is now well known, you will discover that the hor- 
mone insulin, manufactured by the islets of Langer- 
hans in the pancreas, counteracts the otherwise fatal 
disease of diabetes mellitus. 


3. SYNTHESIS BY MODEL 


We can test our understanding of a phenomenon 
by constructing a model. We build into the model the 
elements we think are important and also our concep- 
tion of the way these elements interact to produce the 
phenomenon. Such models can be purely theoretical, 
as in mathematical models of learning behavior, in 
which after a theoretical analysis we postulate the 
essential elements and processes involved in such a 
way that they can be described quantitatively. Then,
in situations which are simple enough to handle
mathematically, we attempt to predict in an equation
how the behavior will change as the variables are
manipulated. We can also construct physical analogies,
as an engineer does when he tries to simulate
human behavior by building a computerized robot. All
such models are attempts to synthesize a behavioral
phenomenon through a model which embodies its
essential elements.
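As a rough illustration of what a purely theoretical model of this kind can look like, the Python sketch below uses a generic linear-operator rule in which the probability of a response moves a fixed fraction of the remaining distance toward 1 on each reinforced trial. The rule, the function names, and the parameter values (`p0`, `rate`, `trials`) are assumptions chosen for illustration; they are not taken from any model discussed in this chapter.

```python
def learning_curve(p0=0.1, rate=0.2, trials=20):
    """Response probability after each reinforced trial.

    Assumed update rule (illustrative only):
        p_next = p + rate * (1 - p)
    """
    probs = [p0]
    for _ in range(trials):
        p = probs[-1]
        probs.append(p + rate * (1.0 - p))
    return probs


def closed_form(p0, rate, n):
    """Prediction of the same rule written as a single equation."""
    return 1.0 - (1.0 - p0) * (1.0 - rate) ** n


curve = learning_curve()
for n in (0, 5, 10, 20):
    print(n, round(curve[n], 4), round(closed_form(0.1, 0.2, n), 4))
```

The step-by-step simulation and the closed-form equation agree, which is the sense in which such a model "predicts in an equation" how behavior should change when a variable (here the assumed learning rate) is manipulated.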


4. SYNTHESIS BY PREDICTION


If we have correctly analyzed the elements of a 
given form of behavior, we should be able to predict 
which variables will control them and the way the
behavior will change as these variables are manipulated
experimentally. This is the most common test
of an analysis by synthesis. It lends itself very easily
to theoretical analysis and in formal versions much
akin to mathematical models has played a prominent
role in learning theory (e.g., Clark Hull’s [1943] hypothetico-deductive
approach to laws of learning).


5. SYNTHESIS BY PARALLEL


So far, none of the above methods has been very 
successful in enabling us to reconstitute complex ani- 
mal and human behavior from the real, simpler sub- 
components of behavior isolated so far (reflex, in- 
stinct). We may not be able to do so until we know a 
great deal more about how reflexes and instincts work. 
However, there is another method of synthesis, rather 
little exploited, which can supplement the previous 
ones. It allows immediate useful application to com- 
plex behavior of any knowledge we have obtained 
about simpler, experimentally isolated behavior systems.
It is “synthesis by parallel,” which in essence
says that something new is like something else that is
already familiar. A parallel is a similarity, and the 
more detailed it is the more confidence we have that 
the similarity is not mere coincidence. One uses this 
method from the conviction that nature is parsimo- 
nious: if a given phenomenon works in a particular 
fashion, it is likely that the same method is used to
produce other phenomena which up to now we have 
not recognized as being the same. Therefore, look for 
a parallel. After evaluating Skinner’s use of analysis 
and synthesis, I shall illustrate the use of parallels as 
a possible way of increasing our understanding of 
operant behavior. 


What was B. F. Skinner’s solution to the problem of 
how to apply analysis and synthesis to the understand- 
ing of learned motivated behavior? Following firmly 
in the footsteps of Descartes, Sherrington, and Pavlov,
Skinner opted for a real simplification of behavior. 
However, the physiological method of transecting the 
nervous system yielded preparations whose behavior 
was too simple—spinal or decerebrate reflexes and 
postural changes seemed unsuitable to reveal the laws 
of learning because these preparations could no longer 
learn. Moreover, trying to build theories of behavior
from these overly simple preparations proved in most 
instances to be a waste of time. Therefore, instead of 
transecting the nervous system to purify the S-R corre- 
lation between environment and behavior, Skinner 
chose to simplify the environment. He put the organism 
into an isolated environment—an opaque, sound-insu- 
lated chamber where one or more stimuli could be in- 
troduced whenever the experimenter desired (Skinner, 
1956). In this, Skinner followed Pavlov, whose work 
on conditioned reflexes had demonstrated that in 
order to reveal lawful correlations between stimuli 
and conditioned reflexes it was absolutely essential to 
eliminate extraneous stimuli. Respondent autonomic 
responses do not act on the world; therefore, Skinner 
chose an arbitrary act (but only one), like pressing a 
bar or pecking a key, and rewarded the hungry or 
thirsty animal with a tiny amount of food or water 
each time it performed the desired act. In a simplified 
world of one stimulus and one response, it imme- 
diately becomes apparent that the presentation of a 
reinforcing stimulus to an appropriately motivated 
animal powerfully shapes its behavior. As will be dis- 
cussed more fully below, since the response and the 
reinforcement appear to be completely arbitrary, we 
have thus achieved a simplified prototype of adaptive 
high-level motivated behavior—a unit of behavior 
whose laws we can now study. 

Skinner then devised a “microscope” to study the
newly isolated “operant”—the cumulative record. Using
the most advanced telephone relay circuitry then
available, he was able to detect the correct response (a 
bar press or a key peck sufficient to close a micro- 
switch) automatically, to deliver the reinforcement 
instantaneously, and to record each response and rein- 
forcement as they occurred. By recording such re- 
sponses cumulatively, one could almost see the devel- 
oping shape of an animal’s expectations—for instance, 
on a fixed-interval schedule (where a reinforcement 
only occurs after a fixed time has elapsed since the 
previous one), the animal uniformly pauses in its re- 
sponding immediately after reinforcement, then grad- 
ually accelerates its responding as the probability of a 
reinforcement increases with time. 
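The shape of such a record can be sketched with a small simulation. The Python code below is a toy illustration only: the response rule (probability of responding in each second rising with the square of the time elapsed since the last reinforcement), the parameter values, and all function names are assumptions made for this sketch, not a description of Skinner's apparatus or data. The fixed-interval contingency itself follows the definition given above: only the first response made after the interval has elapsed is reinforced.

```python
import random


def simulate_fixed_interval(interval_s=60, session_s=600, max_rate=0.9, seed=0):
    """Return (cumulative_record, reinforcement_times) for a toy FI schedule.

    Assumed response rule (hypothetical): in each one-second step the
    probability of a response grows with the square of the fraction of
    the interval elapsed since the last reinforcement.
    """
    rng = random.Random(seed)
    cumulative = []            # cumulative response count at each second
    reinforcement_times = []   # seconds at which a response was reinforced
    total = 0
    last_reinforcement = 0
    for t in range(1, session_s + 1):
        elapsed = t - last_reinforcement
        fraction = min(elapsed / interval_s, 1.0)
        p_response = max_rate * fraction ** 2
        if rng.random() < p_response:
            total += 1
            # FI contingency: only the first response made after the
            # interval has elapsed is reinforced.
            if elapsed >= interval_s:
                reinforcement_times.append(t)
                last_reinforcement = t
        cumulative.append(total)
    return cumulative, reinforcement_times


def responses_between(cumulative, start, end):
    """Number of responses emitted in seconds start+1 .. end."""
    before = cumulative[start - 1] if start > 0 else 0
    return cumulative[end - 1] - before


record, reinforced_at = simulate_fixed_interval()
first = reinforced_at[0]
early = responses_between(record, first, first + 15)
late = responses_between(record, first + 45, first + 60)
print("reinforcements at:", reinforced_at)
print("responses in first 15 s after reinforcement:", early)
print("responses in last 15 s of the interval:", late)
```

Plotting `record` against time would give the scalloped curve described in the text: a nearly flat segment just after each reinforcement, followed by a progressively steeper climb as the end of the interval approaches.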

Skinner used the cumulative record to insure the 
purity of his simplified preparation. When he got 
smooth curves he believed he had pure operant be- 
havior under the control of the reinforcement contin- 
gencies. In a way, this is like the electrophysiologist 
who, when dissecting a nerve bundle to isolate one 
fiber, watches the oscilloscope and dissects until the 
preparation responds with impulses all of the same 
amplitude. The all-or-none law (that a single neuron 
always fires with impulses of the same amplitude) thus 
assures the purity of the dissected nerve preparation. 
In a similar fashion, a smooth cumulative record is 
taken to mean that we have only one kind of behavior 
being recorded—each response follows the next so reg- 
ularly (here uniform frequency rather than amplitude 
is used) that they add up smoothly rather than dis- 
continuously. (If the curve is irregular, as is often true 
early in training, it frequently indicates that we have 
more than one act being used to press the bar. How- 
ever, see below for further discussion of the adequacy 
of this method.) 

Having experimentally isolated and purified the 
operant, Skinner then faced the problem of using it to 
understand the laws of learned motivated behavior. 
In his early work (1938) he used prediction as a 
method of synthesis. He formulated concepts such as 
the reflex reserve to embody the idea of a reservoir of 
responses that is affected both by central organismic 
states and past experience with reinforcement. The 
level of the reflex reserve determined the probability 
of an animal’s behavior in particular instances. 
Similar hydraulic models of instinctive behavior have 
been used in psychoanalysis (Freud, 1912) and in 
ethology (Lorenz, 1952). 

Fig. 1. Tracings of three curves which report behavior in response
to a multiple fixed-interval fixed-ratio schedule. One
of them was made by a pigeon, one by a rat, and one by a
monkey. (From Skinner, © 1956 by the American Psychological
Association. Reprinted by permission.)

But Skinner soon came to feel that such attempts at
theoretical prediction possessed the same drawbacks of
fruitless speculation and lack of contact with the real
phenomena of behavior that were met with in attempting
to reconstruct complex behavior from physiological
or hypothetical simplifications. “A purely
descriptive science is never popular. For the man 
whose curiosity about nature is not equal to his interest
in the accuracy of his guesses, the hypothesis is the
very life-blood of science” (Skinner, 1938, p. 426). He
therefore broke with traditional forms of theory (Skinner,
1950). The operant was simple—but not too
simple. It was of sufficient complexity to embody the
interesting phenomena of learning and motivation.
Environmental analysis assured sufficient simplicity to
reveal reliable S-R relationships that could form the
framework of a scientific description of behavior. Instead
of using a mathematical model or prediction for
synthesis, Skinner used the successful control of behavior
as his validation criterion. The phenomenon
we are interested in (learned motivated behavior) is
clearly evident in the simplified world of the Skinner
box, so all we have to do is describe the way the response
varies as stimuli and internal states are varied.
The fact that they do control the probability of response
guarantees their validity. I call this “synthesis
by success.” As shown in Figure 1, a given fixed-inter- 
val type of schedule (this is actually a multiple sched- 
ule: fixed intervals combined with fixed ratios) pro- 
duces virtually identically shaped smooth curves in 
several species; therefore, the laws being formulated 
have great generality. The procedure is a variant of
“synthesis by model”—our simplified laboratory model, 
derived from animals, works when applied to humans 
in the real world. Skinner’s method thus applies
Francis Bacon’s (1620) dictum that we should engage
in “experiments of fruit” as well as “experiments of 
light”—making useful working application of the 
scientific laws we are formulating. 

Bacon also suggested that we draw up “tables of 
discovery” describing the relationships we have 
worked out. By classifying these relationships, similar 
within category and different between categories, 
fundamental generalities governing each category 
should leap to the eye and mind, and valuable data 
will be gathered in the process. This nonspeculative 
form of data gathering is an empirical approach to 
experiment—it is concrete, not abstract; therefore, no 
one can dispute the facts generated by it. Such “botanizing”
of behavior has great value when applied simultaneously
over many species of animals, as in comparative
physiology or taxonomic ethology. However,
because taxonomy has not been explicit in the operant 
approach (but see Skinner, 1966), some behaviorally 
oriented workers become impatient with it and sus- 
pect its practitioners of application of the operant 
method in trivial instances. Because it is still in its
early stages, the separate categories of operant analysis 
of behavior are not yet clearly apparent, so many do 
not see its theoretical value. Also, because operant 
terminology is not widely used, many psychologists do 
not see how the operant “school” adds more than
simple technology for providing stable behavioral base
lines. But operant methods work, so their application
becomes more widespread.


THE OPERANT AS A CRITERION FOR
MOTIVATION 


In an earlier discussion of this subject (Teitelbaum,
1966), I pointed out: 


When we speak of purposive acts in humans, 
we mean behavior that is directed toward a goal 
and is accompanied by a corresponding motiva- 
tion to obtain that goal. The essential quality 
is the motivational state—the physiological state
of events that corresponds to the urge to per- 
form a particular act, to obtain a certain object, 
or to produce a desired outcome. If we could 
be sure that such a state exists in animals during 
a given act, we could justifiably call that act 
motivated behavior. 

Clearly, if the response is a completely auto- 
matic consequence of the stimulus, we cannot 
speak of motivation. As long as a fixed built-in 
relation exists between a stimulus and a response,
we have no justification for inferring the
additional existence of a motivational state underlying
that response to the stimulus. Such a
state may exist, but we can have no positive 
proof of it. (p. 566) 


By definition, therefore, a reflex excludes motiva- 
tion. It is unconscious, unlearned, and involuntary 
(Skinner, 1931). To infer motivation we must break
the fixed reflex connection between stimulus and re- 
sponse. By its very nature, the operant appears to 
do so. 


In effect, in any operant situation, the stimulus, 
the response and the reinforcement are com- 
pletely arbitrary and interchangeable. No one 
of them bears any biologically built-in fixed 
connection to the others. We arrange the experi- 
mental situation so that the response produces 
the reward and the animal learns the connection 
between them. Once having learned this rela- 
tionship, the animal reveals its motivation by 
the fact that it works to obtain the reinforce- 
ment. This is what all operant conditioning 
situations have in common: the animal’s motiva- 
tion to obtain the reinforcement... If an 
operant occurs, motivation exists. (p. 567) 


Thus by using learning to break the fixed reflex 
connection between stimulus and response, Skinner 
created an emergent unit of behavior—the operant— 
which could be experimentally isolated, whose laws 
could be studied in their own right, and which could 
serve as the prototype of all learned motivated be- 
havior. Because the degree of environmental control 
over the operant in the Skinner box is so great, the 
close S-R correlation is preserved and, with it, the 
scientific power of the laws describing it. Indeed, to 
avoid the pitfalls of speculation and to eliminate the 
“idols of the marketplace” described by Francis Bacon 
(1620) (the tendency to use words with surplus meanings
to describe simpler phenomena), Skinner preferred
to eliminate entirely such constructs as motivation
and awareness. The operant embodies them in
clear-cut S-R relations, and a categorization of those 
relationships should form an adequate scientific basis 
for the control of operant behavior. 


PUZZLING OPERANTS 


As described above, the laws of the operant are re- 
markably well suited for application to humans (Mil- 
lenson, 1967) and have been highly successful when 
applied to human learning and motivation. However,
both in the laboratory and in applied situations, work
on animals has continued apace, and, with increasing 
frequency, seemingly paradoxical phenomena are be- 
ing demonstrated. Early reports of the “misbehavior 
of organisms” came from Breland and Breland (1961,
1966), students of Skinner who applied operant 
methods to the training of a wide variety of animals 
in situations designed to entertain the public. 
Chickens were taught to swing a bat to hit a ball out 
onto a “playing field,” raccoons or pigs to “save” 
wooden coins or dollar bills in piggy banks, whales or
porpoises to play with rubber balls, and so on. How- 
ever, particularly as such animals became better and 
better trained, their performance very often deteri- 
orated. Chickens would run out onto the “playing 
field” to chase the baseball they had just hit with a 
bat; raccoons would “wash” their coins instead of 
dropping them in the box, and pigs would root and 
toss their dollars rather than depositing them in the 
piggy bank. In all these instances, the interfering be- 
havior delayed the reinforcement, sometimes to the 
point where the animals underwent serious weight 
loss, since the conditioned acts were their sole means 
of obtaining food. 

Hineline and Rachlin (1969) pointed out that in 
many circumstances there is great difficulty involved
in training a pigeon to peck a key to avoid electric 
shock, though it could readily learn to do so for food. 
Pigeon key pecking seemed still more perplexing 
when Brown and Jenkins (1968) demonstrated that 
contingent reinforcement with food was not necessary 
to train a pigeon to peck an illuminated key—merely 
using the key to signal the opportunity to eat food 
at brief intervals was sufficient to induce them to peck 
the key light, independent of the reinforcement, at 
very high rates. One might conceive of such “auto- 
shaping” as an example of “superstitious” responding 
(Skinner, 1948b), but the work of Williams and 
Williams (1969) on “negative autoshaping” (where a 
pigeon will learn and continue such key pecking even 
when each response actually prevents the reinforce- 
ment) makes this less tenable. (However, for evidence 
of operant control of autoshaped behavior, see Barrera,
1974.) These behaviors seem to fit in the operant
category but can be extremely difficult to shape, occur 
without reinforcement, despite reinforcement, or dete- 
riorate rather than improve with training. 

Puzzling phenomena are being found in other types 
of learning situations. When a rat feels sick after 
poisoning or exposure to X-rays, it will develop an 
aversion for a novel taste (such as saccharin) but does 
not link the illness to other stimuli such as lights or 
sounds, which were equally available for association
(Garcia & Koelling, 1966). However, in the identical
situation, if the negative reinforcement is the pain of 
electric shock, the light or sound becomes the danger 
signal, whereas the novel taste is ignored. So taste for 
a rat seems physiologically tied to the nausea and 
malaise of poisoning and X-ray exposure, but lights, 
sound, and locations seem linked to the peripheral 
pain caused by electric shock. More visual species, 
such as birds, link the sickness of poisoning to the 
color of a nutrient solution rather than to its taste 
(Wilcoxon, Dragoin, & Kral, 1971).

Taste aversion learning is special in other ways as
well. It seems far more powerful than the instrumental
learning which has served as the paradigm of conditioning
for so many years. In traditional conditioning
experiments, an association between an arbitrary
stimulus and the sight of food can be formed only if
they occur virtually simultaneously. If there is more
than a few seconds’ delay, it is very difficult to produce
a learned linkage between them. Rats, however, can
associate sickness with a novel taste, even if the taste
occurred as much as 12 hours earlier (Smith & Roll,
1967). (Histamine secreted by the body in reaction to
X-ray exposure seems to be involved in such learning
and may be related to these powerful effects—Levy, 
Carroll, Smith, & Hofer, 1974.) 

To ethologically oriented workers, such phenomena
indicate that it may be premature to seek general laws
of learning, including those of the operant. They suggest
that a deceptive generality may result when work
is limited to too few species (Beach, 1950; Bolles,
1970; Rozin & Kalat, 1971; Seligman & Hager, 1972;
Shettleworth, 1972; Tinbergen, 1951). For some
species the innate connections between certain stimuli
and responses may be too reflexive to serve in operant
behavior. They involve stimulus-bound, nonarbitrary,
and noninhibitable acts. Yet in an operant paradigm
they can be manipulated by reinforcement contingencies. Are they
operants? If so, what is wrong with our concept of
operants? If not, what are they, and why do many of
them often seem to obey the laws of reinforcement? 


SIMILAR PUZZLES IN MOTIVATED 
BEHAVIOR 


As described above, an operant act is proof of the 
existence of motivation—if a completely arbitrary 
operant occurs, motivation exists. If we are running 
into difficulties with our conception of the operant, 
the same must be true of our concept of motivation.
By examining motivation, we may gain greater insight
into operant behavior. 

Since the fundamental work of W. R. Hess (1932, 
1954), it has been known that electrical stimulation of
the brain of an unanesthetized animal can elicit in- 
stinctive behavior patterns such as mating, feeding, 
drinking, or fighting. A sated rat stimulated through 
implanted electrodes in the lateral hypothalamus will 
eat large quantities of food (Hoebel, 1971). If this is 
done every day, the rat will overeat and even become 
obese (Steinbaum & Miller, 1965). Such an animal will 
learn a new operant or perform a previously learned 
one (e.g., running a maze or pressing a bar to get
food) during stimulation, thus supporting the idea
that this is truly motivated behavior, not merely some
kind of motor automatism (such as chewing), where
the ingestion of food is an accidental by-product of
the behavior, rather than a desired outcome (Coons,
Levak, & Miller, 1965; Mendelson & Chorover, 1965;
Miller, 1971). The same is true of thirst (Andersson &
Wyrwicka, 1957) and other species-typical behaviors
(Roberts, 1970).

However, a fundamental problem in our conception
of motivated behavior has been identified in the
work of John Flynn and his colleagues (e.g., Flynn,
1973). They implanted electrodes in the lateral
hypothalamus of cats. Many normal cats do not ordinarily
kill rats. However, when stimulated in the lateral
hypothalamus, such cats chase and strike or bite a rat,
usually killing it if the current is left on. Two forms
of such attack were seen. One was accompanied by a
display of rage (retraction of the lips, exposure of the
canine teeth, piloerection and arching of the back,
hissing, growling, pupillodilation—the typical
“Halloween cat”). In the other form, which Flynn and
co-workers called the “quiet biting attack,” the cat
moved swiftly about the cage with its nose low to the
ground, back somewhat arched, and hair slightly on
end, and usually went directly to the rat and bit it
viciously. In the absence of an attack object, neither
form of directed attack would occur. Therefore, stimuli
provided by the rat are necessary before electrical
stimulation can evoke attack behavior. What are
these stimuli?

In their early studies, Flynn and his colleagues
took a Sherringtonian approach to this problem. The
final component in the attack sequence is the killing
bite. Which stimuli elicit it? They restrained the
intact cat so that the animal could lunge, or turn its
head and bite, but could not otherwise walk around.
Without hypothalamic stimulation, touch around the
mouth and on the lips evoked no response. However,
during electrical stimulation, touch around the
mouth, on the side contralateral to the stimulation,
evoked head turning toward the stimulus (see Figure
2). When the lips contacted the stimulus, mouth opening
and biting occurred. Increasing the strength of
hypothalamic stimulation increased the extent of the
sensory field around the mouth and on the lips from
which the response could be evoked (MacDonnell &
Flynn, 1966a). Conversely, if these sensory fields were
denervated by section of the appropriate branches of
the trigeminal nerve, touch stimuli were no longer
effective in evoking head turning, mouth opening, and
biting, but electrical lateral hypothalamic stimulation
could still evoke attack: the cat pounced on the rat
and lowered its head for the killing bite—but then did
not open its mouth when it contacted the rat—“kissing”
rather than biting. The cat could open its mouth
(it did so normally when eating food spontaneously),
but not when stimulated to attack and kill (MacDonnell
& Flynn, 1966b).

Fig. 2. (Left) The cat’s muzzle. (Center) Maximum extent of
the maxillary sensory field for head-orienting responses during
relatively intense hypothalamic stimulation. A similar mandibular
field has not yet been mapped in detail. (Right) Maximum
extent of the sensory field for the jaw-opening response during
relatively intense stimulation. (From MacDonnell & Flynn,
1966a. © 1966 by the American Association for the Advancement
of Science.)

Implicit in this finding is a paradox with potentially
important implications for our thinking about
motivated behavior. On the one hand, if the operant
is a learned arbitrary act, its occurrence depends upon
(1) the memory of past response-reinforcement
contingencies; (2) the central motivated state which makes
that outcome reinforcing; and (3) the expectation that
the operant will continue to produce the reinforcing
stimulus. A cat is motivated to kill a rat if the cat will
press a lever or run a maze to be presented with a rat
which it then kills (Roberts & Kiess, 1964).

On the other hand, we can take an ethological view
of a cat’s rat killing. We can assume that there are
specific fixed action patterns built into the cat’s nervous
system. They are selectively potentiated by hormonal
or other internal states and released by particular,
somewhat complex stimuli, called sign stimuli.
Each S-R fixed action pattern forms a segment in a
chain of behavior which we call the instinctive act.
To account for rat killing by a cat, we assume that (1)
the sight of the rat (its color, small size, and movement)
and its smell activate stalking behavior; (2)
when close to the rat, the cat is stimulated to pounce,
to swipe at the rat with its paws, and to lower its head
for the killing bite; (3) when the cat’s whiskers or
muzzle contact the rat’s fur or skin, its head turns and
brings the lips into contact; (4) touch on the lips
elicits mouth opening; and (5) touch and taste stimulation
of the mouth and tongue evoke biting and
swallowing.

This description dovetails very well with the find- 
ings of Flynn and co-workers described above and 
their additional work on the releasing effects of visual 
and other stimuli on the components of electrically 
evoked attack (Bandler & Flynn, 1972; Flynn, Edwards,
& Bandler, 1971). The ethological view is
strongly supported in studies of birds and fish in 
which removal of a stimulus in the S-R chain aborts 
the instinctive behavior pattern (Tinbergen, 1951). 

But this view implies that the instinctive act is not 
outcome-dependent. The eliciting stimulus, not the 
reinforcement, determines the response. How can we 
reconcile this with our view that rat killing by a cat 
is outcome-dependent? 

Perhaps electrically evoked attack is not controlled
by all the variables controlling rat killing in a normal
cat. Electrically evoked eating in rats seems more
stimulus-bound and more stereotyped than normal
hunger (Valenstein, 1973; Valenstein, Cox, &
Kakolewski, 1968). If this is correct and if normal rat
killing is determined by the reinforcement rather than
the sign stimulus, then an unstimulated cat killing
spontaneously, even with denervated mouth and lips,
should open its mouth, bite, and kill a rat when
hungry or when provoked to rage by pain. This
experiment is theoretically very important and should
be carried out.

Suppose that, in the trigeminal-sectioned cat killing
a rat spontaneously, the killing kiss rather than a
bite occurs. Does this mean that killing is not
motivated—i.e., not outcome-dependent? Not necessarily—
our view of the reinforcement may have been incorrect.
Perhaps each sign stimulus in the ethological
chain may be reinforcing. Part of its releasing action,
particularly in the experienced animal, may be due to
the memory of the reinforcement provided by that
stimulus in the past. This means that such an animal
should press a lever to gain the opportunity to swipe
at a rat, to pounce on it, or merely to see and chase
it. In fish, for instance, sign stimuli can be reinforcers.
A Siamese fighting fish will learn to press a lever for
the mere sight of another fighting fish (Thompson,
1963). In the mating dance, a male stickleback will
perform an operant for the opportunity to see a receptive
female whom he then courts (Sevenster, 1973).
There is still another way of reconciling sign stimuli
with reinforcement, but it must wait till we consider
the evidence on motivation as revealed during recovery
from brain damage.


RECOVERY FROM LATERAL 
HYPOTHALAMIC LESIONS 


As we have seen from the work of MacDonnell and
Flynn (1966a), electrical stimulation in the lateral
hypothalamus opens peripheral sensory fields around
the mouth whose stimulation then yields head orientation
and biting, reflexive components of the cat’s
instinctive attack pattern. Increasing the intensity of
the stimulation expands the fields. Does lateral
hypothalamic damage shrink such fields and prevent
normally effective stimuli from acting on them? The
lateral hypothalamus is involved not only in attack,
but also in eating. Does the lateral hypothalamic
syndrome of aphagia and adipsia depend, in part, on loss
of responsiveness to sensory stimuli?

In order to answer these questions, Marshall,
Turner, and Teitelbaum (1971) applied a series of
simple neurological tests to normal rats. A normal rat
investigates a stimulus by orienting its head toward it.
This natural response was used to determine the
responsivity of rats before and after lateral hypothalamic
damage. For example, to test vision on each side
of the body, a 2-in. square (5 x 5 cm) piece of white
or yellow cardboard was moved in front of each eye.
Normal rats typically turn their heads toward this
visual stimulus. To test olfaction, the investigators looked for
head orientation to a 1/4-in. cube of chocolate held in
forceps or to a cotton swab soaked in Mennen shaving
lotion (both of which elicited approach) or to an
ammonia-soaked swab (which elicited approach followed
by turning the head away).

In an exact converse of the results of MacDonnell 
and Flynn (1966a), damaging the lateral hypothal- 
amus on one side profoundly impaired the rat’s ability
to orient to stimuli on the side contralateral to
that of the lesion (see Figure 3). Rats with unilateral
lesions initially showed no orientation to contralateral 
visual, olfactory, whisker-touch, or somatosensory 
stimulation, whereas they responded promptly to the 
same stimuli presented ipsilaterally. Rats with bi- 
lateral lesions showed impaired responsivity to sensory 
stimuli on either side. 

Although the precise nature of such sensory neglect
needs further behavioral analysis (see Turner, 1973),
several observations suggest that the orientation
impairment is neither a total motor paralysis nor an
inability to sense the stimuli. In normal grooming, a
rat usually starts by grooming its face and head,
turns and grooms one side and flank, and terminates
the sequence by grooming the opposite side and flank.
Thus during normal grooming the rat’s head is turned
and oriented to one or the other side just as it is
during normal orientation to tactile stimuli from that
side. After unilateral lateral hypothalamic damage,
the rat grooms first the ipsilateral, then the
contralateral side of the body. However, even seconds after
grooming the side contralateral to the lesion, the rat
ignores tactile stimuli to that side and fails to turn
toward them. Such failure to respond to stimuli, even
though the animal can perform the necessary head
movements, suggests that the deficit is more sensory
than motor. However, the deficit does not resemble
deafferentation, because autonomic (respiratory
changes) and skeletal reflex (eye closure, tooth chattering)
behaviors often occurred when a stimulus was
presented on the contralateral side. The deficit seems
to be more of an inability on the rat’s part to integrate
the sensory information with the adaptive motor
patterns involved in orienting to a stimulus (Marshall
& Teitelbaum, 1974; Turner, 1973).

Fig. 3. A rat with unilateral (right) lateral hypothalamic
damage shows precise head orientation and biting to various
kinds of stimuli (whisker touch, odor, body touch) on the
ipsilateral side (pictures at left) while neglecting the same
stimuli presented contralaterally (pictures at right). (From
Marshall, Turner, & Teitelbaum, 1971. © 1971 by the American
Association for the Advancement of Science.)

Such sensory neglect can drastically affect the in- 
stinctive behavior patterns involved in eating, drink- 
ing, and attack. Bilateral lateral hypothalamic lesions 
produced total aphagia and adipsia lasting as long as 
9 days, followed by the usual stages of recovery. 
Analysis of the recovery of orientation to sensory 
stimuli showed that the transition from Stage I (com- 
plete aphagia) to Stage II (accepting only highly 
palatable foods) occurred on the same day or shortly 
after direct head orientation to olfactory stimuli and 
whisker touch first appeared. Rats with unilateral 
lesions that were tested for side preference in feeding 
generally took more food from the container located 
in the ipsilateral field, though preoperatively no such 
preference had existed. Similarly, after unilateral 
lateral hypothalamic lesions, rats that normally killed 
mice ignored the mouse when it was in the contra- 
lateral field. However, as soon as the mouse moved 
across the midline into the ipsilateral field, the rats 
showed oriented biting attack. 

In summary, the evidence from electrical stimulation
or ablation strongly suggests that the role of the
lateral hypothalamus in the control of motivated behavior
is at least in part due to its ability to potentiate
the action of peripheral stimuli in eliciting the fixed
instinctive action patterns involved in eating, drinking,
or attack. But where does the operant fit into this
picture?

Fig. 4. Stages of recovery seen in the lateral hypothalamic
syndrome. (The critical behavioral events which define the
stages are listed on the left.) (From Teitelbaum & Epstein,
1962.)


PARALLEL BETWEEN RECOVERY AND 
DEVELOPMENT IN THE LATERAL 
HYPOTHALAMIC SYNDROME 


Some insight into the role of the operant comes 
from the analysis of the stages of recovery from the 
aphagia and adipsia that result from bilateral lateral 
hypothalamic damage. The pattern of behavioral re- 
covery is summarized diagrammatically in Figure 4. 
(For a detailed analysis of the homeostatic mecha- 
nisms in the lateral hypothalamic syndrome, see 
Epstein, 1971.) The striking fact about this syndrome 
is that every lateral hypothalamic animal shows the 
same sequence of recovery. Depending on the lesion 
size and accuracy of placement, animals recover more 
or less rapidly. Also, although they may start their 
recovery at any point in the sequence, the progression 
from that point follows an invariable pattern. Therefore,
the pattern of recovery could indicate a basic
process of neural reorganization.

A newborn suckling rat ingests milk but refuses 
water. At weaning, although they eat dry food and 
drink water, infants still do not respond fully to
dehydration (Adolph, 1957; Heller, 1947, 1951; Křeček
& Křečková, 1957). This resembles some of the symptoms
seen during the various stages of recovery of food
and water regulation in the adult rat after lateral 
hypothalamic damage (Teitelbaum & Epstein, 1962). 
In a sense the lateral hypothalamic rat during part of 
its recovery is like an infant rat. 

If a parallel exists between adult recovery and infant
development, each stage in recovery from the
lateral hypothalamic syndrome should be reflected in
proper sequence during development in infancy. To
study this in detail, we slowed down the course of
normal development by thyroidectomizing infant rats
at birth or shortly thereafter and studied their regulatory
capacity at the normal weaning age of 21 days
(Teitelbaum, Cheng, & Rozin, 1969a, 1969b).

Fig. 5. Comparison of the development of eating and drinking
in rats' infancy and their recovery after hypothalamic lesion
in adult. (The upper right half of each block represents the
recovering lateral hypothalamic rat, and the lower left half
the growing thyroidectomized or starvation-stunted rats. Uniform
coloring in each full block indicates similar responses in
recovery and development.) (From Teitelbaum, Cheng, & Rozin,
1969a.)

At that age, thyroidectomized weanling rats displayed
every stage of the lateral hypothalamic syndrome.
If greatly retarded in development (as reflected
in their body weights at weaning), some were completely
aphagic and adipsic when offered wet palatable
foods or ordinary food and water. Others, more fully
developed at weaning, accepted wet palatable foods
but did not eat enough to maintain their weight. If
this stage lasted too long, they died. Other weanlings,
even less retarded, gained weight and regulated their
caloric intake of a liquid diet (they at least doubled
their volume intake when the caloric density was
one-third as great), but were still adipsic and would have
died (some did die) when offered only dry food and
water. Finally, the least retarded weanlings accepted
dry food and water, but drank only when they ate;
like recovered adult rats, they were prandial drinkers
and did not drink in response to body dehydration.
They were finicky, and although they ate more in the
cold they did not eat more in response to glucoprivation.
As shown in Figure 5, depending on the degree
of retardation, every stage of the adult lateral
hypothalamic syndrome was seen in thyroidectomized rats
at weaning.

In a separate experiment (Cheng, Rozin, & Teitel- 
baum, 1971), we severely starved neonatal rats through- 
out the suckling period by limiting their access to the 
mother. Control litters with unlimited access to their
mother developed normally. The starvation-retarded
runts, when tested at weaning (21 days of age), showed
every stage of the lateral hypothalamic syndrome.
Like the thyroidectomized infants, if they survived
they progressed through the various stages as they 
developed. 

One may immediately object to drawing a parallel
between recovery from lateral hypothalamic damage
and stages of development of the regulation of food
and water intake in infancy. After all, lateral
hypothalamic animals die of starvation and thirst, whereas
infants do not. But we do not ask newborn infants to
eat like adults—they nurse at the breast until they can
be weaned. It follows that adult lateral hypothalamic
animals should nurse like infants, and, without any
other maintenance, should be able to keep themselves
alive until they recover additional regulatory
capacities. Figure 6 shows such an experiment. Merely
by offering repeated access to a substitute mother (a
milk bottle with a modified drinking spout, provided
at frequent intervals day and night), Dr. Cheng
(unpublished results) showed that an otherwise completely
aphagic and adipsic adult lateral hypothalamic
rat in Stage I was able to ingest sufficient quantities
to keep itself alive for 9 days. Then it progressed to
Stage II; i.e., it ate wet palatable baby foods from a
dish. Thus otherwise aphagic adult animals, like infants,
can ingest sufficient food to stay alive by nursing
reflexively.

Fig. 6. A lateral hypothalamic adult rat, otherwise totally
aphagic for nine days, nurses reflexively at a milk-containing
baby bottle (with modified nipple). Frequent feedings allowed
it to ingest sufficient quantities to stay alive and recover to
the anorexic stage.


STAGES OF RECOVERY AND 
DEVELOPMENT OF THE HUMAN GRASP 


If the parallel between adult recovery and infant
development emerged from the study of another brain
system, independent of feeding and drinking, one
could be more sure of its validity. From work on a
different species and on a different brain system, a
very similar parallel has independently been demonstrated.
T. E. Twitchell studied the recovery of movement
following cerebral hemiplegia in human patients.
His work stemmed from the earlier observation
of Seyffarth and Denny-Brown (1948), who pointed
out that several types of reflexive movements could
be elicited from a paralyzed limb although the limb
could not be used for voluntary movement. In the
hand, for instance, they identified three types of
reflexive grasping—the traction response, the true grasp,
and the instinctive grasp reaction. In studying the
recovery of stroke patients, Twitchell (1951) discovered
that these grasping automatisms could always be
identified and that they represented distinct sequential
stages in the recovery of voluntary control of
movement.

Fig. 7. Evolution of the automatic grasping responses of
infants. (Twitchell, 1965.)

In subsequent studies, Twitchell (1965, 1969, 1970) 
found that very similar stages to those seen in recov- 
ery from hemiplegia in adults can be demonstrated in 
the normal development of voluntary control of grasp- 
ing in newborn infants (Figure 7). In a human sys- 
tem as well as in the rat, adult recovery recapitulates 
infantile ontogeny. 


TRANSFORMATION OF SENSORY CONTROL 
OVER AN APPROACH RESPONSE 


The concept of stages of sensory control over a re- 
sponse is implicit in the parallel between recovery 
and development. During recovery of the grasp, spinal 
proprioceptive mechanisms in the form of tendon 
jerks and increased stretch reflexes (spasticity) occur 
first. These are modified by tonic neck and vestibular 
body-righting reflexes into the traction response. The 
recovery process then proceeds to the next stage, in 
which flexion of the fingers can be obtained reflex- 
ively by a distally moving tactile stimulus to the 
medial palm. This is the true grasp reflex. In a sense,
as Twitchell (1951) points out, the tactile grasp reflex
facilitates the proprioceptive grasp (the traction
response). Eventually, the sight of a stimulus is enough
to cause the hand to reach out and grasp it (Twitchell,
1970). Presumably, with sufficient recovery, true
“operant” use of the hand returns.

The sequence of development and recovery of
sensory controls over an approach reaction may be
more general than is now suspected. For instance, a
similar sequence seems to govern the gaping response
of the newborn thrush. Immediately after hatching,
the stimulus modalities that appear to elicit the gape
are proprioceptive and vestibular; the chicks gape
when the parent bird alights on the nest, which
shakes the head and body of each chick. The gape is
directed vertically upward (vestibular control),
independent of the position of the parent bird. Touching
the side of the face near the mouth also elicits the
gape, which is still directed upward. Later, the sight
of the parent bird (or the approaching hand of the
experimenter—see Figure 8) elicits gaping, but it is
still directed upward. Finally, the gape is directed
toward the visual stimulus (visual control now not
only triggers the response but also guides its
orientation—Tinbergen & Kuenen, 1939). Although
Tinbergen and Kuenen did not investigate it, it is
possible that, still later, operant gaping as an emitted
response would develop in such birds. If so, then, as
in recovering hemiplegic humans and developing normal
infant rats, the end result of such a sequential
transformation of control of a reaction pattern would
be voluntary or self-initiated action—an operant
response. If there is generality in the sequence of sensory
transformation of a reaction pattern, we may have a
beginning insight into a mechanism of transformation
of reflexes into operants during infant development or
adult recovery of function.

Fig. 8. Nestling thrushes gaping upward in response to visual
stimulus but not directed by it. (From Tinbergen, 1951. By
permission of Oxford University Press.)


STAGES OF RECOVERY AND 
DEVELOPMENT OF LEARNED BEHAVIOR 


If these concepts apply to the study of operant behavior,
we should be able to demonstrate stages of
integration of learned responses during recovery from
brain damage and in early infantile development.
For instance, Glavcheva, Rozkowska, and Fonberg
(1970) studied the lateral hypothalamic syndrome in
dogs. As in other animals, such lesions produce
aphagia and adipsia with eventual recovery. Operant
responding for food after such lesions also disappears
and eventually recovers (Rodgers, Epstein, &
Teitelbaum, 1965). When a normal dog swallows food,
gastric contractions are inhibited and the stomach
relaxes. This is its action as an unconditional stimulus.
After training sessions presenting the sound of a
metronome for 50-60 sec preceding the presentation
of food, the sound of the metronome alone can produce
stomach relaxation and inhibition of motility.
After bilateral lateral hypothalamic damage,
Glavcheva et al. showed that such conditioning is
completely abolished. Stages of recovery of conditioning
can then be demonstrated. At first, there is stomach
atonia—a complete loss of tonus and spontaneous
hunger contractions. This stage generally corresponded
to complete aphagia or anorexia with adipsia. Then
a stage of rhythmic automatic gastric contractions
appears, but the stomach still seemed completely cut off
from influence by the rest of the nervous system—the
unconditioned relaxation effect of food in the stomach
and the conditioned effect of the sound were both
absent. In the next stage of recovery of control of
gastric motility, the unconditioned relaxation effect
of food reappeared, but the conditioned stimulus
(sound) was still ineffective. Finally, in the last stage,
the conditioned stimulus regained its effectiveness.

It would be interesting to investigate, both in re- 
covering lateral hypothalamic dogs and in newborn 
puppies, whether there may be a similar sequence of 
stages of conditionability of sensory control over gas- 
tric motility (kinesthetic, vestibular, touch, and vision). 
Since salivary conditioning is also impaired after 
lateral hypothalamic damage (Rozkowska & Fonberg, 
1972), the sequence of recovery and development of 
its conditioned sensory control should be investigated
as well.

It is suggestive (see Table 1), as support for such
a sequence, that Russian investigators find a rather
similar, immutable sequence of sense modalities in
the development of the capacity for Pavlovian
conditioning in human infants (Kasatkin, 1960).


STAGES OF ENCEPHALIZATION 
OF THE OPERANT 


To understand any behavioral phenomenon, we
must view it as we would a stage of development or
recovery. Such a stage only makes sense as a transformation
from the stage of integration that preceded it (a lower
level of encephalization) toward a higher level of
integration. Like the nervous system whose action it
reflects, behavior is a hierarchically organized structure.

Table 1. Developmental Sequence of Conditionability as a Function of Type of Conditional Stimulus

SIMPLE CONDITIONED RESPONSE

Example stimuli, by sensory analyzer:
  Vestibular: change of body position
  Auditory:   complex 65-db tone
  Tactile:    tickle, sole of foot
  Olfactory:  oil of roses or lavender
  Taste:      5% sugar solution
  Visual:     colored light

                      Vestibular     Auditory     Tactile  Olfactory  Taste  Visual
First appearance      8d             15d to 24d   28d      28d        35d    40d
Semistable response   15d            40d          45d      45d        45d    2m
100% stable response  20-24d to 1m   35d to 2m    2m       2m         2.5m   3m

SIMPLE DISCRIMINATION

Example stimuli, by sensory analyzer:
  Vestibular: up-down from sideways
  Auditory:   1 octave higher or lower from CS+
  Tactile:    right from left foot
  Olfactory:  roses from lavender
  Taste:      1% from 5% solutions
  Visual:     red from green or blue

                      Vestibular     Auditory     Tactile  Olfactory  Taste  Visual
First appearance      1m             2m           2m       2m         2.5m   3m
Semistable response   1.5m           2.5m         2.5m     2.5m       3m     3.5m
100% stable response  2m             3m           3m       3m         3m     3.5m

Note: Ages appear as days (d) or months (m) (Kasatkin, 1960).

Reprinted with permission of Macmillan Publishing Co., Inc. from Infancy and Early Childhood, Y. Brackbill (ed.). © 1967 by the

This point of view may help resolve some of the 
apparent paradoxes in behavior which now exist. A 
striking example of such a resolution comes from 
the study of sexual behavior (Beach, 1966). Adult 
female rodents in estrus show lordosis when stroked 
around the flank or genital region. This has long 
been used as a criterion of adult sexual receptivity. It 
was therefore extremely surprising to discover that 
newborn guinea pigs (male or female) displayed such 
lordosis in the first few hours, or even for some days, 
after birth. Was this due to the surge of female estrual 
hormones in the mother that occurs at parturition? 
No—pups born of mothers ovariectomized midway 
through their pregnancy showed the same lordosis 
when stroked (Boling, Blandau, Wilson, & Young, 
1939). Careful behavioral analysis revealed that in in- 
fant guinea pigs, lordosis is the basic posture assumed 
reflexly during urination evoked by the mother’s lick- 
ing of the infant’s genital region as it stands and 
nurses underneath her. Indeed, such tactile stimula- 
tion is necessary for survival in infancy (Beach, 1966), 
because, as in spinal adult animals shortly after 
surgery, bladder distention does not produce relaxation
of the urinary sphincter. With recovery from
spinal shock in the adult, such control by distension 
returns. Apparently, in the newborn infant the tactile 
lordosis reflex promotes urination. Later in develop- 
ment, this reflex disappears and bladder distension 
becomes sufficient to produce urination. In the adult 
female, estrus hormones reinstate the lordosis reflex. 
Just as hypothalamic stimulation involved in killing 
opens and expands tactile sensory fields around the 
mouth (MacDonnell & Flynn, 1966a), estrual hor- 
mones elicit lordosis through tactile sensory fields 
around the genitals (Komisaruk, Adler, & Hutchison, 
1972). Therefore, the reappearance of infantile reflexes
can occur not only in brain damage, but also in
the normal action of hormones which produce in- 
stinctive behavior. 

Other instinctive behavior patterns go through suc- 
cessive levels of ontogenetic transformation of nervous 
control. Each stage of transformation in the hierarchy 
allows new controls to modify the action of the lower 
stage of integration. For instance, as McGinty (1971) 
has pointed out (see Figure 9), the various stages of 
sleep may be conceived of as separate stages of
encephalization of the sleep system. It is well known that
newborn infants are capable mainly of rapid eye
movement (REM) sleep. This seems a relatively reflexive,
obligatory form of sleep: the infant has little control
over it (it cannot inhibit sleep) and lapses readily
into it when its stomach is full and it is not cold or
wet. When its nervous system matures, slow-wave
sleep emerges as a separate stage with consequent
inhibition of the amount of REM sleep. (In the adult
decorticate animal, REM represents about 40 percent
of total sleep, versus about 20 percent for a normal
cat—Jouvet, 1962; and personal communication.) It
is as though each higher stage of encephalization limits
access to the less encephalized stage preceding it in
ontogeny. This may be adaptive: the more reflexive
form of sleep (REM) is suppressed and partially replaced
by slow-wave sleep which in turn can be more
readily inhibited, thus allowing more opportunity for
adaptive waking behavior.

Fig. 9. Appetitive behavior chain in the adult. (I) Selection:
diurnal pattern integration, discrimination of drive, suppression
of competing drives. (II) Approach and preparation: exploration;
for feeding, locomotion to food source, selection of food,
bringing food to mouth; for sleep, locomotion to sleep site,
assumption of sleeping posture, drowsy state. (III) Consumption:
for feeding, ingestion; for sleep, quiet sleep, active sleep.
(From McGinty, 1971.)

When the normal adult animal goes to sleep, it 
goes through a fixed sequence of stages: it must enter
slow-wave sleep before it can proceed to the stage of
REM sleep. Perhaps, therefore, not only can slow-wave
sleep inhibit REM, but also during each normal bout of
adult sleep, slow-wave sleep may serve as a
selective facilitating mechanism in the transition from 
waking to REM sleep. This resembles the way the 
true grasp allows touch to act as a facilitator in the 
release of the more infantile proprioceptive grasp— 
the traction response (Twitchell, 1951, 1970). 

The above example of sleep behavior is important 
because it suggests that in adult behavior, the normal 
act of going to sleep involves a rapid, reversible de- 
encephalization of function: sleep, as an instinctive 
behavior pattern proceeds from the appetitive behav- 
ior patterns involved in searching for an appropriate 
place, circling and curling up, lowering the head, closing
the eyes, etc., and progresses through successive
stages to the consummatory act—the achievement of 
REM sleep. Like food and water intake, the amount 
of REM appears to be regulated homeostatically; e.g., 
after deprivation of REM, more of it occurs (is con- 
sumed?) when free access is once again possible 
(Dement, Henry, Cohen, & Ferguson, 1967). 

Perhaps every adult motivated behavior pattern 
proceeds in a similar sequence from operant and Pav- 
lovian approach behavior to the reflex consummatory 
act. In the study of instinctive behavior, ethologists 
have roughly distinguished such levels of nervous 
control by referring to appetitive versus consummatory
behavior (Tinbergen, 1951). Similarly, in man,
the appetitive act of reaching for food is more operant
in its character (less stereotyped, more subject to
willed inhibition) than is a later act in the chain
(reflexive chewing), and much more so than swallowing,
which is completely reflexive once the food has
reached a point far enough back in the mouth. Each 
behavioral event is followed by the next more auto- 
matic one until the final molecule-to-molecule match 
is achieved (Breland & Breland, 1966). 

Operant behavior may have a similar hierarchical 
structure, built up by successive levels of transformation
in development. The adult operant may be
viewed as the initial, most arbitrary (learned) part of 
an approach sequence toward a fixed consummatory 
act. During that sequence, the behavior becomes pro- 
gressively deencephalized (more stimulus-bound, less 
subject to willed inhibition), culminating in reflexive 
consummatory behavior. 

Of what use is it to think this way if the sequence 
reels itself off so quickly in an operant act that we
cannot isolate its parts? In a way, this is like the problem
faced by George Wald in isolating the transition
states in the transformation of rhodopsin to vitamin
A (Wald, 1968). Lumi-rhodopsin and meta-rhodopsin
were isolated by literally freezing the reaction: at a
temperature below -40°C, the reaction proceeded only
to lumi-rhodopsin. Gradual warming allowed
it to proceed to meta-rhodopsin and eventually to 
reach its final transformation state—vitamin A plus 
opsin. In like manner, stages of encephalization of 
behavior can be “frozen” in recovery after brain dam- 
age or in normal development in infancy. Starvation 
or thyroidectomy in infancy can slow the development 
process still further. 

Is it possible to reveal such stages during normal 
motivated acts, where, if they exist, they are so fleeting 
and evanescent that we cannot now isolate them? Perhaps
the apparent errors in operant behavior can
help to reveal the stages of transformation. In the
ethological study of instinct, “mistakes” in apparently
purposive acts are useful diagnostic indicators. For
instance, when rolling its egg back into the nest, the
greylag goose must complete the head-withdrawal pattern
even if the egg slips out from under its beak
during the act (Tinbergen, 1951). To the ethologist 
this proves that egg rolling is a fixed, built-in, instinc- 
tive action pattern rather than an outcome-determined 
operant. Likewise, the fixed response of a fish or a 
bird to a dummy model defines the existence of a sign 
stimulus and proves that the act it releases is an in- 
stinctive, rather than learned, pattern. As we have 
seen, such “errors” in adult mammalian behavior 
(called abnormalities) are often diagnostic of damage 
to the central nervous system. In a similar way, what 
seem today like paradoxes in learned behavior may 
turn out to be diagnostic indicators of the level of 
encephalization of that behavior. 


SUMMARY AND CONCLUSIONS: 
LEVELS OF OPERANT BEHAVIOR 


As discussed earlier, several phenomena in operant 
behavior (the “misbehavior” of organisms, negative 
automaintenance, etc.) may seem puzzling because we 
assume the operant to be an indivisible, emergent 
unit, with laws different from those governing the 
simpler subcomponents of behavior (reflex, instinctive 
fixed action pattern). In its idealized form, perhaps 
best expressed in human behavior, it is the very anti- 
thesis of a reflex. A reflex is an unconscious, unlearned, 
involuntary, built-in fixed response to a particular 
kind of stimulus. As idealized, the operant is a con- 
scious arbitrary act which, through learning, has be- 
come associated with an arbitrary stimulus and whose 
frequency is maintained by an arbitrary reinforce- 
ment. In principle, all stimuli, all operants, and all 
reinforcements are interchangeable because no one of 
them bears any biologically built-in connection to the 
others. 

In the laboratory, or in real life, when reinforce- 
ment is used to shape behavior, paradoxes may appear 
when one or more of these idealized assumptions is
violated. As Skinner (1938) pointed out, one source of 
error may arise in the act chosen as the operant. Per- 
haps, especially in lower animals like pigeons and 
rats, the operants usually used to study the laws of 
reinforcement are not as arbitrary as we have thought. 
In autoshaping, orienting toward the key is more 
readily influenced by food reinforcement than is the 
act of pecking it (Wessells, 1974). Pecking for food 
reward may be very closely tied physiologically to the
act of eating and may not be very arbitrary at all 
(Moore, 1973). Pressing a bar for a rat may be an act 
borrowed from some other instinctive pattern, but by 
the power of reinforcement or through Pavlovian con- 
ditioning may be shaped up to act as the initial ap- 
proach segment of the chain leading to the consump- 
tion of food or water. A lighted key that signals food 
or water may become a sign stimulus releasing the 
consummatory act of pecking in a pigeon, thus yield- 
ing some of the phenomena of autoshaping (Jenkins 
& Moore, 1973). 

Even if a given act satisfies the criteria of interchangeability
that assure its arbitrariness at the beginning
of training (in other words, if the act is so
highly encephalized that it bears little resemblance to
a reflex), it may become rapidly deencephalized under
many circumstances: as the consummatory act is approached;
with routinization in overtraining (the
“misbehavior” of organisms); or with the fatigue, 
frustration, conflict, and thwarting (lack of reinforce- 
ment) that are the inevitable accompaniments of any 
reinforcement schedule. If we assume that fatigue or 
conflict can inactivate the higher-level neural controls 
characterizing the appetitive components of an in- 
stinctive behavior chain, then the very same act may 
become less encephalized (more stimulus-bound, stereotyped,
synergistic, and exaggerated in its intensity) as
the behavior is constantly repeated. These are all 
attributes of release phenomena usually seen most 
clearly after brain damage. Displacement phenomena 
in instinctive behavior might be viewed as manifestations
of deencephalization. Indeed, components of instinctive
patterns that are normally considered inappropriate
(i.e., that are normally suppressed during
the behavior in question) may be released as displace- 
ment activities. As Falk (1971) has pointed out, many 
of the adjunctive behaviors seen in operant situations 
(psychogenic polydipsia, pica, etc.) resemble displace- 
ment activities. Similarly, during the frustration in- 
duced by thwarting on a food-reinforcement schedule, 
a pigeon or squirrel monkey may display release of 
attack normally elicited by painful external stimuli 
(Azrin, Hutchinson, & Hake, 1966; Hutchinson, Azrin, 
& Hunt, 1968). 

Many of these paradoxes that seem so perplexing 
arise from our view of the operant as an indivisible 
emergent unit whose laws differ from the simpler sub- 
components of behavior revealed by physiological and 
ethological analysis. It is possible that the operant 
undergoes many transformations, and we must devise 
methods to isolate its transition forms. We must increase
the magnifying power of our behavioral microscopes.
The smooth curves in our cumulative records
may no longer appear smooth if we speed up the 
motor of the recorder. As Schwartz and Williams 
(1972) have demonstrated, an autoshaped peck has a 
different duration from a peck that obeys the rein- 
forcement contingencies. Slow-motion photography in 
the operant setting may reveal further changes in the 
topography of the key peck during thwarting. We 
should use evolutionary taxonomy as a microscope to 
reveal differences in the character of operants that will 
eventually fit into Baconian tables of similarities and 
differences. In my opinion, these will correspond to 
developmental and phylogenetic levels of encephaliza- 
tion. 

Our search for general laws of learning is indeed
premature, but not unwarranted. Taste-aversion learning
now seems bizarre and difficult to encompass in
our previous conceptions of learning. However, it may
illustrate that learned associations between stimuli 
and physiological states undergo transformations and 
become less arbitrary as they are tied to sensory and 
motor systems which develop earlier in ontogeny or 
recover differentially after brain damage. Perhaps, as 
in the grasp reflex, such sensory controls develop in a 
characteristic sequence. 

We have a long way to go in our study of motiva- 
tion, learning, and operant behavior. We must use all 
the techniques of real simplification: physiological 
(the study of brain damage), developmental (the study 
of maturing infants), and ethological (the taxonomic 
study of the similarities and differences seen in operants).
It is not yet possible to use direct synthesis to
assemble the real, simpler subcomponents of behavior
to prove that our analysis is valid. However, the study
of the parallel transformations in behavior during recovery
and development can provide a real synthesis.
The behavioral descriptions of S-R correlations at
each level of encephalization are as much in accord
with the approach of Sherrington as they are with
that of Skinner. They fit Skinner’s criteria for a scientific
approach to behavior, yet enable us to perfect our
understanding of the levels of integration of the operant.
INTRODUCTION


Operant psychology started with a meal. Using eat- 
ing as a means of studying reflexes and reflex chain- 
ing, Skinner developed the apparatus, the conceptual 
framework, and the methodology of operant analysis 
(Skinner, 1930, 1931, 1932a, 1932b, 1935). 

The focus on eating stemmed from the search for a 
recurrent, lawful behavior for analysis. The “orderly
periodicity in... eating activity” reported in Rich- 
ter’s 1927 study of meal taking by the rat provided 
Skinner with such a phenomenon. Skinner sought to 
account “for the appearance or nonappearance of a 
given set of behavior at a given time.” Richter’s “sim- 
ple observation of whether a rat eats” was, “after all, 
only an all-or-none measure” (Skinner, 1930, 1932a). 
Richter’s analysis focused on meal frequency and duration
in a freely feeding animal. In consequence,
Skinner devised a measure of the strength of feeding
behavior based upon the rate of eating within a meal 
(Skinner, 1932a). The assumption underlying this strat- 
egy was that knowledge of the strength of the chain of 
reflexes within a meal would make it possible to pre- 
dict both the onset and termination of a meal and 
thus the pattern of meals. Skinner's interest in the 
strength of the feeding reflexes within a meal led to a 
shift from this continuous sampling of a nondeprived 
animal’s behavior to a sample gained in a short, con- 
strained experimental session in a food-deprived an- 
imal. With his parsimonious, Baconian devotion to 
the observable, Skinner sought the laws of eating 
solely in the relations between reflex probabilities and 
such operations as fasting and feeding which changed 
these probabilities (Skinner, 1931, 1932a). He eschewed 
any dependence upon hypothetical, neurological, or 
physiological structures or states. The behavior laws 
were considered to be self-sufficient. There was no
need to reduce them to or explain them by phenom- 
ena from some other domain. Today, this tradition, 
further refined, is continued by such investigators as 
Herrnstein (1970), Morse and Kelleher (1970), Premack
(1959), and Timberlake and Allison (1974).

To study feeding behavior an eatometer was devised.
This apparatus consisted of a door which the
rat had to push open in order to seize a single pellet 
from a food magazine. Following a period of time 
without food, the rats were placed within the ap- 
paratus and allowed to consume food until eating 
ceased. Each door opening was recorded on a cumula- 
tive recorder which produced a record of the rate of 
eating. Rate, then, was considered to be the measure 
of reflex strength (Skinner, 1932a). 

The behavior was so orderly that Skinner suc- 
cumbed to temptation and fitted a curve to the data 
[N = Kt^n; where N is the amount (number of pellets)
eaten, t the time in session, and n and K curve-fitting 
constants]. 
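
As an illustration of what fitting this curve involves: taking
logarithms of N = Kt^n gives log N = log K + n log t, a straight
line in log-log coordinates, so K and n can be estimated by
ordinary least squares. The sketch below is only a modern
restatement under stated assumptions, not a reconstruction of
Skinner's own procedure. The function names are hypothetical, and
the data are synthetic, generated from assumed values K = 10 and
n = 0.7 rather than taken from any record.

```python
import math


def fit_power_law(times, counts):
    """Estimate K and n in N = K * t**n by least squares on log-log data."""
    xs = [math.log(t) for t in times]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                      # slope of the log-log line
    K = math.exp(mean_y - n * mean_x)  # intercept, transformed back
    return K, n


def eating_rate(K, n, t):
    """Instantaneous rate of eating, dN/dt = n * K * t**(n - 1)."""
    return n * K * t ** (n - 1)


# Synthetic session (assumed values, for illustration only):
# minutes into the session and cumulative pellets eaten.
times = [5, 10, 20, 40, 80]
counts = [10.0 * t ** 0.7 for t in times]

K, n = fit_power_law(times, counts)
early_rate = eating_rate(K, n, 10)
late_rate = eating_rate(K, n, 40)
```

With an exponent below 1, as assumed here, the derivative
nKt^(n-1) falls as the session proceeds, which is the sense in
which the rate of eating declines within a meal.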

At this juncture Skinner, still under the influence 
of the Sherringtonian definition of the reflex, was in- 
terested in analyzing the rate of occurrence of the 
various members of the reflex chain. In particular, he 
was concerned with the influence of eating time (chew- 
ing and swallowing) on the refractory phase of the 
initial reflex in the chain (seizing) and sought to discover
if “the law expressed in the equation N = Kt^n
is independent of the particular reflex that initiates 
the eating behavior” (Skinner, 1932b). To deal with 
this question, Skinner introduced an “arbitrary initial 
member” to the reflex chain, the lever press. 


The food tray is accordingly replaced by a repeating
“problem box” which delivers a pellet
of food into an open trough each time a hori- 
zontal lever is pressed downward. 


Thus the Skinner box was born. The results obtained 
from the Skinner box showed that “the rate of change 
of the rate of eating is independent of the nature of 
the particular reflex with which eating behavior begins.”
The delight in this conclusion, however, was
soon surpassed by the excitement generated by new 
discoveries concerning the arbitrary initial member 
and the subsequent interest in elaborating the con- 
cept of the operant. This diversion left the original 
problems unexplored. 

In this chapter we wish to focus on three points in 
Skinner’s original analysis of feeding. First, we shall 
consider whether Skinner’s use of the reflex as the 
unit of analysis was a felicitous choice. Second, we 
shall examine the “orderly periodicity” of eating reported
by Richter which stimulated Skinner’s analysis.
And third, we shall explore whether operations
other than fasting and feeding control the pattern of 
meal taking. 
Eating as a Reflex


The basic assumption of the operant analysis of 
behavior is that current consequences control future 
performance. An important, interesting, and very diffi- 
cult question concerns the specification of the units of 
behavior on which these consequences act. The char- 
acterization of these units can vary from simple 
muscular movements devoid of meaning (Guthrie, 
1935) to complex patterns of responses whose dynamic 
interaction is intrinsic and is shaped by their conse- 
quences (Kohler, 1929). As the first step in his analysis 
of this problem, Skinner chose the reflex as the func- 
tional unit. 

The reflex as a unit of analysis has a long history 
(Fearing, 1930; Skinner, 1931). In 1662, an era when 
physical science was in the first flush of its initial suc- 
cesses in devising mechanical models of the physical 
universe, Descartes introduced the concept of the re- 
flex as an attempt to explain animate motion with a 
mechanical model (Jaynes, 1970). He derived his in- 
spiration from the hydraulically actuated dolls in the 
gardens of St. Germain which executed intricate patterns
of movement when “stimulated” by someone
treading upon a concealed pedal (Fearing, 1930; 
Jaynes, 1970; Skinner, 1931). In Descartes’s view, an- 
imals were automatons. Only the voluntary activity of 
humans was excepted from this categorization. Be- 
havior could be exhaustively duplicated by sufficiently 
complex machines obeying only physical principles. 

The reflex has been accepted by physiologists and 
psychologists of all persuasions as an accurate charac- 
terization of at least some aspects of behavior. There 
are at least three major reasons for this attraction: 


1. Once Pavlov (1927) had demonstrated the condi- 
tionability of reflexes, it was relatively easy to con- 
ceive of reflexes as the building blocks for complex 
sequences of behavior (Guthrie, 1935; Hull, 1937; 
Skinner, 1938). These sequences could be assembled 
by an organism’s phylogenetic or ontogenetic inter- 
action with its environment (Skinner, 1966). In this 
view, order in behavior only reflects order in the 
environment, not structure in the organism. 


2. Mentalistic or nonphysical language can be avoided 
in the analysis of behavior by defining the stimulus 
component of the reflex as any arbitrary change in 
the environment and the response component as 
any arbitrary movement of the organism without 
reference to intention, purpose, or function. This 
stratagem precludes any biological or psychological 
terms at the level of the data language. The psy- 
chological meaning of a behavior sequence derives 
from such criteria as establishing “smooth curves 
for dynamic laws” (Skinner, 1938). Skinner cautioned
against the use of vernacular, biological,
neurological, or mentalistic concepts for other than 
heuristic purposes in the search for useful classes of 
variables (Skinner, 1938). 


3. Finally, acceptance of the reflex requires little or no 
theoretical commitment to mental or physiological 
processes as behavioral substrates. In this search for 
functional laws devoid of causal explanations, Skin- 
ner followed Mach and Bridgman (Skinner, 1931). 


Historically, the major disadvantage of a reflex 
analysis is that description of behavior has, in fact,
become circumstantial and complex. It has become 
necessary to elaborate complex concepts such as the 
“observing response” (Wyckoff, 1952) or “pure stimulus
acts” (Hull, 1930) to deal with phenomena such as
attention or intention. 

Reflex-based descriptions of behavior are also
plagued by the ultimately difficult problem of the infinite
number of movements and stimuli which might
occur at any given cross section in time and place.
Skinner’s concept of the generic nature of the stimulus
and response allowed him to circumvent the tedious
problem of botanizing stimuli and responses
(Skinner, 1935). The stimulus class was defined by the
experimental context, and the response class by its
effect on the environment rather than by its detailed
topography. In fact, Skinner advocated watching the
recorder rather than the animal. Conceptual utility
was determined by the extent to which a given set of
environmental variables resulted in “simple” laws
when a given category of behavior was used (Skinner,
1938).

Skinner’s generic characterization of a response was
a historically important step away from his original
analysis of the “strengths” of the specific reflexes
(seizing, chewing, swallowing) involved with the ingestion
of food (Skinner, 1932a). His finding that the
“rate of change of the rate of eating is independent of
the nature of the particular reflex with which eating
begins” (Skinner, 1932b) made possible his use, as an
arbitrary initial member, of a specific environmental
event (that is, a bar press) to represent the whole
chain of behavior.

Richter (1927) had shown that a freely feeding rat 
exhibited periodic episodes of eating, defined as 
meals, which excluded other activities. His rats were 
not deprived in the usual sense, and the initiation or 
termination of a particular bout of eating was not 
predictable from the variables he analyzed. Rather, he 
was only able to characterize the number, duration, 
and distribution of these episodes of behavior. Skin- 
ner approached the problem of determining such an 
episode of eating by introducing the operation of fasting,
which insured initiation of eating. That is, it
placed initiation under the experimenter’s control and 
made the description of within-meal behavior the 
locus of analysis. The basic data of the within-meal 
analysis was the initial rate of ingestion and the rate 
of change of rate of ingestion within a session as 
functions of such operations as fasting and feeding. 
Since only a single episode of eating, defined by the 
experimental session, was observed, no data on the 
frequency, duration, and distribution of meals in time 
was obtained. 

Even after interest in eating per se had shifted to 
the effects of schedules on behavior, the same experi- 
mental paradigm was maintained: Animals were de- 
prived of food, and their behavior was measured dur- 
ing an experimental session. The analysis still focused 
on “reflex strength” as Skinner originally defined it
(Skinner, 1935). Complex behavior was constructed 
from the simplest reflex components. The laws of 
combination are derived from the laws of reflex 
strength. Thus since in this experimental paradigm 
only a single meal, initiated and terminated by the 
experimenter, was observed, the analysis of the results 
obtained was of within-meal behavior. The question 
raised by employing this tactic is whether the same 
variables which predict within-meal behavior also 
predict between-meal behavior, and if so, are the functions
the same? We shall attempt to demonstrate that
this analysis does not predict the pattern of feeding in 
the freely feeding, nondeprived animal and has led to 
the neglect of several important variables in the study 
of animal learning and motivation. 

Skinner followed in the footsteps of the early 
physicists in developing his research strategy. Having 
borrowed a mechanical model (i.e., the reflex) from 
physics for the study of behavior, it was also natural 
to borrow a methodology in the form of the refine- 
ment experiment. Early physics was hampered by 
poor tools and materials and a lack of coherent theory
for identification of variables. Relations established 
between variables were subject to large amounts of 
error, and the program was one of successive refinement
of the experiment to reduce input from “extraneous”
sources in order to eliminate error and discover
the “true” law. The best experiment was one in
which the effects produced by the variable(s) being 
studied were large relative to all other effects. Operant 
analysis has not strayed from this path (Sidman, 1960). 
To study an automaton, one needed to reduce ex- 
traneous stimuli and restrict the response possibilities 
so that one could find the law hidden in the variety 
of activities observed. Hence a highly inbred, docile 
animal, limited in historical inputs (naive), was chosen
as the object of experimentation. He was placed in a 
box isolated from the sight, sounds, and smells of his 
neighbors, where the only question was “to press or 
not to press.” It was a study of performance in soli- 
tary confinement. Different animals and different situ- 
ations were not required to discover basic laws (Skin- 
ner, 1938). The conviction was that laws derived using 
the reflex as the unit of analysis are universal. That
is, such laws are invariant across species, response 
classes, and reinforcers (see Skinner, 1966, for a quali- 
fication of this thesis). It should be emphasized that 
the experimenter exerts close control over the behavior
that is exhibited within this paradigm. The animal
is (purposely) restrained from exhibiting its full
repertoire of behavior. Although these procedures re- 
duce ‘extraneous’ sources of variation and limit the 
behavior displayed, they offer little opportunity for 
observing the kinds of “solutions” the animal might 
make to a similar problem occurring in his “natural,” 
noisy, and (to the experimenter) confusing environ- 
ment. Rather, it is assumed that knowledge of an 
animal’s evolutionary history, his classification, his 
current situation, his ecological niche, and his present 
habitat does not contribute in any fundamental way 
to the understanding of the principles of behavior 
acquisition and maintenance. 

It will be the argument of the present chapter that 
the meal, as originally defined by Richter (1927), is a 
better unit of analysis than the reflex for the study 
of hunger, because, in the original sense of Skinner 
(1938), it “gives smooth curves for dynamic laws.” 
This unit will best reveal its utility in an environ- 
ment in which the animal himself schedules the ini- 
tiation and termination of behavior. It seems possible 
that the discovery of the important variables in such 
an environment is most likely to solve Skinner’s 
original problem of accounting “for the appearance of
a given act of behavior at a given time” (Skinner,
1930, 1932a; see above).


Feeding and Fasting 


One of the great conceptual difficulties with a 
mechanical model of animate behavior is the provi- 
sion of a motive force and direction for the organism. 
The solution to this problem for the early mechanists 
was homeostasis (cf. Pavlov, 1927). The theory of 
homeostasis derived from the concept of equilibrium
systems developed in thermodynamics. It was orig- 
inally used to describe the dynamic constancy of the 
fluid matrix of the cells (Cannon, 1932). Its meaning 
has since been extended to include any steady state in 
which some parameter is regulated around a privileged
value. Regulation of energy balance is inferred
when average energy expenditure and intake are 
matched. This condition must be met by any or- 
ganism maintaining a constant size or fixed pattern 
of growth or senescence. In addition, regulation can 
be the result of many different mechanisms, both be- 
havioral and physiological (Yamamoto & Brobeck, 
1965). In the strict sense, homeostasis implies negative
feedback, specialized receptors, and moment-to-moment
monitoring of energy balance. In the case of
feeding, the process is assumed to consist of successive 
depletion and repletion phases (DeRuiter, 1967). De- 
pletion occurs as a result of metabolism. When energy 
stores are depleted below a threshold or critical value, 
feeding behavior (search, seizure, ingestion) preempts 
other ongoing activities. Ingestion leads to repletion, 
and, when an upper threshold of the energy stores or 
some surrogate of these stores is exceeded, ingestive 
behavior ceases. This depletion-repletion cycle differs 
from the more usual homeostatic systems since the 
item being controlled (e.g., food or water) is discon- 
tinuously present in most environments in contrast to 
an item such as oxygen which is continuously present 
(cf. Cannon, 1932). Thus feeding can only occur in 
episodes rather than continuously. It is important to 
note that in this model, feeding is initiated in re- 
sponse to a substantive deficit. The character of this 
deficit is still unspecified and is the locus of most 
speculation and current research activity in feeding. 
The depletion-repletion model has been invoked to 
explain a variety of motivated behaviors, including 
those for which there is no obvious biological sub- 
strate (e.g., curiosity). In fact, the view of necessity as 
the driving force of behavior is an important part of 
the conventional wisdom of Western civilization. 
Western laws and customs are based on the notion 
that a person or animal will only perform some act if 
some essential requirement is taken from him (depriva- 
tion) and given back (reinforcement) in small units 
contingent upon the individual performing the re- 
quired act. This historical fact may account for past 
failure to consider alternative models in analyzing 
motivation. 

The depletion-repletion model of motivated be- 
havior has generated two main problems yet to be re- 
solved: (1) What is the nature of the signals which are 
correlated with depletion and repletion? and (2) 
What mechanisms detect and interpret these signals? 
A variety of physiological processes have been pro- 
posed to serve as the signals, from stomach contrac- 
tions or dry mouths (Cannon, 1932) to circulating 
metabolites and/or electrolytes (cf. Fitzsimons, 1971; 
Hoebel, 1971; LeMagnen, 1971; Mayer, 1955). The
most popular “interpreter” of these hypothetical sig- 
nals has been the hypothalamus, with the two proc- 
esses, depletion and repletion, being represented in 
the lateral and ventromedial nuclei, respectively (cf. 
Hoebel, 1971). This simple, elegant, two-stage model 
has generated most of the research on feeding and 
most theories of food-based motivation, with the result 
that there has been little systematic research on feed- 
ing outside this framework or on animals other than 
rats. 

The pattern of this research has been to test the 
implications of various versions of the homeostatic- 
hypothalamic model. This pattern of research ex- 
emplifies explanation in terms of a conceptual nervous 
system decried by Skinner (1931). For example, home- 
ostatic models dominated research in the area of feed- 
ing long before any of the requisite physiological and 
neurological measurements were possible (e.g., Hull,
1943). Only recently have strong reservations about 
the underlying assumptions of this model been ex- 
pressed (e.g., Collier, Hirsch, & Hamlin, 1972; Falk, 
1971; Fitzsimons, 1971; Kissileff, 1975; Oatley, 1970). 

The Skinnerian analysis of hunger avoided these 
problems of homeostatic theorizing and can best be 
summarized in a quotation from The Behavior of 
Organisms (1938, pp. 342 f.; see also Skinner, 1932a,
1932b):


In dealing with the kind of behavior that gives
rise to the concept of hunger we are concerned
with the strength of a certain class of reflexes
and the two principal operations that affect it:
feeding and fasting.


This proposition asserts the primacy of behavioral
analysis. It is possible that history may show that more
progress in the study of hunger would have been
made had the laws of feeding behavior been investigated
within the positivistic framework advocated by
Skinner rather than within those of the ever-changing
models of the central nervous system (see also Adolph,
1947; Brody, 1945; Kleiber, 1961; Richter, 1927;
Young, 1936).


Meals 


Richter’s early observations showed that rats distributed
their feeding in episodes which can be regarded
as discrete meals. Over a 24-hr period of free
feeding, he observed 8-10 meals (Richter, 1927). The
definition of the term free in these studies does not
imply cost, quantity, or availability, but rather that
the animal determines the initiation and termination
of a meal. This becomes clear when the six parameters
which exhaustively describe meals are considered:
frequency, duration, amount, rate, intermeal interval, 
and choice of items. The meal as a unit of analysis 
satisfies the criteria of reliability. When a minimum 
amount of ingestive activity (e.g., 10 sec in the feeder 
or 3 pellets consumed) is used to define meal initia- 
tion, and a period of time without ingestive activity 
(e.g., 10 min of no eating) defines meal termination, 
meals prove to be discrete events which are relatively 
insensitive to changes in the criteria (e.g., Baker, 1953; 
Hirsch, 1973; Kissileff, 1970; Levitsky, 1970; Pank- 
sepp, 1973; Richter, 1927; Thomas & Mayer, 1968; 
Wiepkema, 1968). It is clear that meals can be consistently
measured. The question of interest is whether
systematic laws can be found using meals as the units 
of analysis. 
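The meal criteria described above can be sketched in code. The following is only an illustrative sketch, not the procedure of any cited study: the function name `segment_meals`, its arguments, and the sample record are hypothetical, and the thresholds (3 pellets to count as a meal, 10 min without eating to end it) are simply the example values given in the text.

```python
from typing import List, Tuple


def segment_meals(
    eating_times: List[float],
    pellets: List[int],
    min_pellets: int = 3,
    max_gap_min: float = 10.0,
) -> List[Tuple[float, float, int]]:
    """Group feeding events into meals.

    eating_times: event times in minutes, sorted in ascending order.
    pellets: number of pellets eaten at each event.
    A pause of at least max_gap_min ends the current episode; an episode
    counts as a meal only if it contains at least min_pellets pellets.
    Returns one (start, end, pellets_eaten) tuple per meal.
    """
    meals: List[Tuple[float, float, int]] = []
    if not eating_times:
        return meals
    start = prev = eating_times[0]
    total = pellets[0]
    for t, n in zip(eating_times[1:], pellets[1:]):
        if t - prev >= max_gap_min:
            # The pause is long enough to terminate the episode.
            if total >= min_pellets:
                meals.append((start, prev, total))
            start, total = t, 0
        total += n
        prev = t
    if total >= min_pellets:
        meals.append((start, prev, total))
    return meals


# Hypothetical record: one pellet per event, times in minutes.
times = [0, 1, 2, 30, 60, 61, 63, 64]
record = [1] * len(times)
print(segment_meals(times, record))
# Changing the termination criterion leaves this record's meals unchanged.
print(segment_meals(times, record, max_gap_min=5.0)
      == segment_meals(times, record, max_gap_min=20.0))
```

In this made-up record the single pellet taken at minute 30 never reaches the initiation criterion and is not counted as a meal, and the same two meals are recovered with a 5-min or a 20-min termination criterion.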

The current revival of interest in meals stems from
the hypothesis that meals reflect the momentary physi- 
ological state of the organism (cf. LeMagnen, 1971; 
Teitelbaum & Campbell, 1958; Thomas & Mayer, 
1968). An animal is presumed to initiate a meal fol- 
lowing a period of time without eating when the level 
of circulating metabolites, hormones, or reserves reflects
a critical level of depletion. Similarly, the meal
is terminated when ingestion effects some critical
change in physiological condition. A logical implica- 
tion of the depletion-repletion model is the existence 
of significant correlations between size of meals and 
intermeal intervals. There are two possible correlations.
The first is between the intermeal interval preceding
a meal and the size of the meal. The second is
between the size of the meal and the following inter- 
meal interval. In the first case, if the amount eaten in 
a meal is a function of the degree of depletion, it
should reflect time since the last meal. That is, the
meal following a long period of no eating should be
larger than one following a short intermeal interval.
If, however, the time lapse between meals is too short
for the depletion threshold to be exceeded, some 
other mechanism must instigate meals. Similarly in 
the second case, following a large meal, the intermeal 
interval should be longer than that following a small 
meal, reflecting the influence of a “satiety” mech- 
anism. 
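The two correlations can be made concrete with a short sketch. This is an illustration under stated assumptions rather than the analysis used in the cited studies: the helper names `pearson_r` and `meal_interval_correlations` and the sample values are hypothetical, and `intervals[i]` is taken to be the interval between meal `i` and meal `i + 1`.

```python
from math import sqrt
from typing import List, Tuple


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def meal_interval_correlations(
    meal_sizes: List[float], intervals: List[float]
) -> Tuple[float, float]:
    """Return (premeal_r, postmeal_r) for one animal.

    premeal_r: interval preceding a meal versus the size of that meal.
    postmeal_r: size of a meal versus the interval that follows it.
    """
    premeal_r = pearson_r(intervals, meal_sizes[1:])
    postmeal_r = pearson_r(meal_sizes[:-1], intervals)
    return premeal_r, postmeal_r


# Hypothetical animal whose intervals are proportional to the preceding meal.
sizes = [1.0, 2.0, 3.0, 4.0, 2.0]       # g per meal
gaps = [10.0, 20.0, 30.0, 40.0]         # min between successive meals
pre, post = meal_interval_correlations(sizes, gaps)
print(round(pre, 3), round(post, 3))
```

A depletion account predicts a high premeal correlation and a satiety account a high postmeal correlation; in this made-up record only the second is present.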

A significant correlation between meal size and the 
premeal interval has not been demonstrated under 
free-feeding conditions (Baker, 1953; Balagura &
Coscina, 1968; Booth, 1972; Hirsch, 1973; LeMagnen
& Devos, 1970; LeMagnen & Tallon, 1966; Levitsky,
1970; Levitsky & Collier, 1968; Snowden, 1969; Thomas
& Mayer, 1968; Wiepkema, 1968; Zeigler, Green, &
Lehrer, 1971). It is important to note that this finding
is in marked contrast to the results obtained when an 
animal undergoes substantial deprivation preceding a 
meal (Adolph, 1947; Bolles, 1967; Stellar & Hill, 
1952). Whenever weight loss exceeds 7-10% of ad lib 
weight, there is a linear relation between body weight 
loss and many different measures of performance (cf. 
Collier, 1969). The fact that meal size is a function of 
the deprivation interval when depletion exceeds a 
certain critical size reflected in body weight but not 
for the minor weight loss occurring between meals in 
freely feeding animals suggests that different proc- 
esses may be involved in initiating eating in these two 
cases. Studies of stomach and intestinal contents in 
freely feeding rats and guinea pigs which have average 
intermeal intervals of 2-4 hrs indicate that there is a 
continuous and relatively constant intestinal load, 
even though the stomach load fluctuates with meals. 
Thus any fluctuations in the input across the in- 
testinal lumen would be endogenous in origin (Collier, 
Hirsch, & Hamlin, 1972). It would seem that in environments
in which the commodity whose intake is being
controlled (e.g., food or water) is discontinuously
present, animals have met the problem of maintain- 
ing a constant milieu interne by establishing a con- 
stant milieu externe in the gut. The gut acts as a 
reservoir which buffers the episodic pattern of intake. 
This is most obvious in the large ruminants. An adult 
dairy cow, for example, usually has approximately 60 
gallons of fluid in the rumen. The existence of these 
continuous intestinal loads suggests that meals in 
freely feeding animals might not be initiated by de- 
pletion, but rather by some endogenous process. The 
null hypothesis in this case would be that there is a 
base rate of meal initiation in the freely feeding an- 
imal generating a random sequence of meals which, 
on the average, result in an adequate intake. In- 
dividual meals occur independently of the state of 
the organism (Premack & Kintsch, 1970). Only the 
parameters of this distribution, not the individual 
events, would reflect regulatory control. 
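One way to picture this null hypothesis is a simulation in which meals start at a constant base rate regardless of what was eaten before. The sketch below is an assumption-laden illustration, not a model taken from the cited work: it treats meal initiation as a Poisson process, draws meal sizes independently from an exponential distribution, and uses hypothetical names and parameter values throughout.

```python
import random
from math import sqrt

MINUTES_PER_DAY = 1440.0


def simulate_null_meals(days, meals_per_day, mean_size, seed=0):
    """Simulate meals initiated at a constant base rate.

    Intermeal intervals are exponential (a Poisson process), and each meal
    size is drawn independently of the time since the last meal.
    Returns (meal_times_in_minutes, meal_sizes).
    """
    rng = random.Random(seed)
    rate_per_min = meals_per_day / MINUTES_PER_DAY
    t, times, sizes = 0.0, [], []
    while True:
        t += rng.expovariate(rate_per_min)
        if t >= days * MINUTES_PER_DAY:
            break
        times.append(t)
        sizes.append(rng.expovariate(1.0 / mean_size))
    return times, sizes


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


times, sizes = simulate_null_meals(days=500, meals_per_day=10, mean_size=2.0, seed=1)
intervals = [b - a for a, b in zip(times, times[1:])]
print("mean meals per day:", len(times) / 500)
print("mean daily intake:", sum(sizes) / 500)
print("postmeal r:", pearson_r(sizes[:-1], intervals))
print("premeal r:", pearson_r(intervals, sizes[1:]))
```

Under these assumptions the long-run averages (meals per day, daily intake) sit near the values set by the parameters, while both meal-interval correlations stay near zero. That is the sense in which only the parameters of the distribution, and not the individual meals, would carry the regulation.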

On the other hand, a significant correlation be- 
tween meal size and the interval following a meal has 
been reported by LeMagnen (LeMagnen, 1971; Le- 
Magnen & Devos, 1970; LeMagnen & Tallon, 1966) 
and a number of other investigators (e.g., Balagura & 
Coscina, 1968; Booth, 1972; Levitsky, 1974; Snowden, 
1969; Thomas & Mayer, 1968). Recent papers have
raised statistical, methodological, and theoretical ob- 
jections to the validity of this correlation (Collier, 
Hirsch, & Hamlin, 1972; Hirsch & Collier, 1974a, 
1974b; Panksepp, 1973). For example, the pooling of 
subjects (LeMagnen, 1971) can produce a significant
correlation simply as a result of intersubject differences
in meal frequency and duration. Panksepp
(1973), in an elaborate analysis, has shown other sta- 
tistical artifacts in computation of the correlation. 
There is, further, the suggestion that both diet com- 
position (Levitsky, 1974) and texture (Thomas & 
Mayer, 1968) may affect the size of the correlation. 
The failure to find either correlation consistently, the 
continuity of intestinal load, and the small weight loss 
in the intermeal interval all suggest that the opera- 
tions of fasting and feeding (Skinner, 1932a, 1938) are 
not the sole determinants of the appearance or non- 
appearance of eating in freely feeding animals. A new 
class of variables must be sought in order to discover 
lawful relations. 
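The pooling artifact mentioned above is easy to reproduce with made-up numbers. In the sketch below, which is purely illustrative (the two "subjects" and all values are hypothetical), neither animal shows any relation between meal size and the following interval, yet pooling their meals yields a large correlation because one animal eats fewer, larger meals than the other.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical subject A: small meals, short intervals, no within-subject relation.
sizes_a = [1.0, 2.0, 1.0, 2.0]            # g per meal
gaps_a = [50.0, 50.0, 70.0, 70.0]         # min to the next meal

# Hypothetical subject B: large meals, long intervals, no within-subject relation.
sizes_b = [3.0, 4.0, 3.0, 4.0]
gaps_b = [150.0, 150.0, 170.0, 170.0]

r_a = pearson_r(sizes_a, gaps_a)
r_b = pearson_r(sizes_b, gaps_b)
r_pooled = pearson_r(sizes_a + sizes_b, gaps_a + gaps_b)
print(r_a, r_b, round(r_pooled, 3))
```

Computing the correlation separately for each animal, as in this sketch, is one way to keep intersubject differences out of the estimate.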

Richter (1927) found that the daily pattern of 
meals was sensitive to a variety of environmental 
variables. The availability of alternative activities 
such as climbing on towers, running in wheels, and 
nesting in boxes substantially affected meal frequency 
and duration. Another such effect is the diurnal
rhythm in eating. That is, a rat is most likely to be 
found eating in the dark phase of the light-dark cycle, 
irrespective of the time between meals or the size of 
the previous meal (Baker, 1953). These observations,
buttressed by recent laboratory results (Collier, Hirsch,
& Hamlin, 1972; Hirsch, 1973; Hirsch & Collier, 1974a,
1974b; Kanarek, in press; Levin & Levine, 1974; Marwine,
1974) and field studies (Bell, 1971; Estes, 1967a,
1967b; Kruuk, 1972; Schaller, 1967, 1972), suggest that
an analysis of the relations between the parameters of
meals and environmental variables may be fruitful.

One can speculate that species have evolved feed- 
ing patterns through the course of evolution that 
reflect their niche (Schoener, 1971). Animals have 
specialized in the exploitation of specific food sources, 
which vary in (1) availability, (2) nutritional quality, 
and (3) caloric density. If the energy budgeted for the 
procurement and ingestion of food represents an im- 
portant portion of the total energy budget, it seems 
improbable that animals could afford the risk of sub- 
stantial depletion before instigating feeding behavior. 
This might create a condition of insufficient energy 
for successful feeding activity. Further, it is a striking 
observation that animals living in undisturbed eco- 
logical systems appear to match their numbers to re- 
sources. Since resource matching implies that the be- 
havior of animals must “anticipate” their needs rather 
than respond to them, variables other than immediate 
physiological state must mediate the initiation and 
termination of feeding. Hungry or at least starving
animals are very seldom observed in the wild except
for reasons of illness, age, or social status (Wynne-Edwards,
1962). This fact suggests that animals in undisturbed
habitats have developed behavior, both social
and nonsocial, which insures an adequate intake
of food. For example, when the available food is insufficient,
the dominance hierarchy in a flock of chickens
insures that the dominant birds consume their
usual rations, while the subdominants receive the
remainder. The subdominants will starve to death in
the presence of food rather than challenge the dominant
birds. There is no direct competition for food,
only for status (Wynne-Edwards, 1962). Similarly,
when food is seasonally abundant, both birth rate and
consumption are constrained in such a fashion that
the population density matches the period of least
availability of food (Wynne-Edwards, 1962). A final
example can be drawn from ruminants. These animals
have a large storage capacity in the rumen and a long
transit time from ingestion to absorption. A consistent
intake must be maintained to provide the raw materials
for fermentation. The end product only gradually
becomes available for regulatory information. It
seems unlikely that intake can directly track the
momentary metabolic state under these circumstances.

The considerations discussed above suggest that it
may be useful to return to Richter’s original methodology
and study animals in environments in which
they can exhibit the behavior which they have evolved
to solve both the economics and the physiology
of feeding. This would require a return to the situation
in which the animal initiates and terminates
feeding and in which the meal is the useful unit of
analysis. Such an analysis may well reveal new classes
of variables and relationships controlling feeding.


Fig. 1. Temporal sequence of the meals of two cats at several ratio
requirements. The width of each pip shows the duration of the meal.
(From Kanarek, 1975.)


FREE FEEDING 


The environments in which free feeding has been 
studied have varied widely. One which has been ex- 
tensively used provides a caged laboratory animal 
with nutritionally complete food (as the experimenter 
perceives it) and water. Both commodities are continuously
available. The ad lib lines in Figure 1 show
sample 24-hr records of such feeding behavior in two
house cats (Kanarek, 1975). The food was Purina
Cat Chow available in a hopper attached to a large
cage. The feeding episodes were distributed as discrete
meals of varying size. The majority of the meals were
consumed in the dark. Approximately 9-10 meals per 
24 hr were taken. There were no consistently signifi- 
cant meal-intermeal correlations. The largest meals 
typically preceded and followed the intervention of 
the experimenter for purposes of weighing and daily 
maintenance. This can be clearly seen in Figure 1, 
where management occurred at 5 p.m. Similar effects 
of experimenter intervention on meal size have also 
been seen in rats and guinea pigs (Hirsch, 1973; Levitsky,
1970; Marwine, 1974).

Although the characteristic number of meals taken
within a 24-hr period varies widely among species and
may be species-typical, the same general pattern of
meal-taking behavior has been described in animals
as diverse as rats (e.g., Richter, 1927), mice (Wiepkema,
1968), guinea pigs (Hirsch, 1973), gerbils
(Kanarek, unpublished observations), pigeons (Zeigler,
Green, & Lehrer, 1971), chickens (Duncan, Duncan,
Hughes, & Wood-Gush, 1970), cats (Kanarek, 1975),
dogs (Robinson & Adolph, 1943), and cockroaches
(Faber, 1975).


George Collier, Edward Hirsch, and Robin Kanarek


[Figure 2. Panels: ADULTS and YOUNG. Phases within each panel:
Purina Chow, Celluflour dilution, Purina Chow, with 24-hr, 24-hr,
and 72-hr deprivation (DEP.) periods marked. Horizontal axis:
DAYS (10 to 70).]

Fig. 2. Mean daily number of meals for young and adult guinea
pigs. The darkened portion of the histogram shows the number
of meals taken at night, and the unfilled portion shows the
number of meals taken during the day. (From Hirsch, 1973.)

When food and water are readily available, meal 
frequency is the parameter of free feeding that re- 
mains invariant under a variety of experimental con- 
ditions. This constancy is seen clearly in a develop- 
mental study of free-feeding behavior in the guinea 
pig (Hirsch, 1973). Figure 2 shows meal frequency for 
two groups of five guinea pigs each. The young an- 
imals were 10 days of age at the start of the experi- 
ment, and the adults were 100 days old. Meal fre- 
quency did not change from 10 days to almost 6 
months of age. Growth-related changes in food intake 
were accomplished entirely by increases in meal size. 
The day-night distribution of meals (shown by the
unfilled and filled portions of the histogram, respectively)
also remained constant over this time period, with as
many meals taken in the light as in the dark.

The relative insensitivity of meal frequency to ex- 
perimental modification is also illustrated by the fact 
that water restriction did not influence the daily num- 
ber of meals. Adolph (1947) first pointed out the 
strong interrelation between food intake and water in- 
take. There is a voluntary reduction in food intake 
when water is given in a limited ration, resulting in a 
linear relation between the size of the water ration 
and food intake (Collier & Knarr, 1966; Collier & 
Levitsky, 1967). For our present purposes, the surpris- 
ing observation is that both rats (Levitsky, 1970; Mar- 
wine, 1974) and guinea pigs (Hirsch, 1973) initiate as 
many meals during periods of water restriction as 
they do when water is freely available. The reduction 
in food intake occurs because smaller meals are taken. 
These results are obtained when the water is given in
a limited ration (Levitsky, 1970), when it is given for a
limited amount of time (Hirsch, 1973), or when intake is
reduced because animals are required to lever-press for
their daily water allotment over the 24-hr period
(Hirsch & Collier, 1974b; Marwine, 1974). These findings
are surprising in light of the close temporal relation
between feeding and drinking in animals fed
and watered ad lib.

Another example of the insensitivity of meal fre- 
quency to manipulation is seen following lesions that 
destroy the ventromedial region of the hypothalamus. 
The overeating that results from these lesions is due 
entirely to changes in meal size (Teitelbaum & Campbell,
1958; Thomas & Mayer, 1968). These observations
indicate that, under a variety of conditions, animals
with free access to food modulate intake by
adjustments in meal size rather than frequency.
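Because daily intake is the product of meal frequency and mean meal size, a change in intake can be apportioned between the two. The sketch below is one possible bookkeeping, offered as an assumption rather than a method from the cited studies: it splits the change on a logarithmic scale so that the two shares sum to 1, and the function name and sample numbers are hypothetical.

```python
from math import log


def intake_change_shares(freq_before, size_before, freq_after, size_after):
    """Apportion a change in daily intake between frequency and meal size.

    Daily intake = meals per day x amount per meal, so
    log(intake_after / intake_before)
        = log(freq_after / freq_before) + log(size_after / size_before).
    Returns (share_due_to_frequency, share_due_to_size).
    """
    d_freq = log(freq_after / freq_before)
    d_size = log(size_after / size_before)
    total = d_freq + d_size
    if total == 0:
        raise ValueError("daily intake did not change")
    return d_freq / total, d_size / total


# Hypothetical water-restriction case: 10 meals per day before and after,
# with mean meal size falling from 2.0 g to 1.5 g.
freq_share, size_share = intake_change_shares(10, 2.0, 10, 1.5)
print(abs(freq_share), size_share)
```

With frequency unchanged, the whole reduction is assigned to meal size, which is the pattern the preceding paragraphs describe.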

The preceding description is not meant to suggest 
that regulatory changes in feeding behavior cannot 
be mediated by adjustments in feeding frequency. 
Changes in meal frequency occur following olfactory 
bulbectomy (LaRue & LeMagnen, 1971) and lateral 
hypothalamic lesions (Kissileff, 1970), during chronic 
diabetes (Booth, 1972; Panksepp, 1973), and during 
intragastric nutrient infusion (Thomas & Mayer, 
1968). Further, those species which make compensa- 
tory changes in food intake in response to manipula- 
tions of caloric density of diet appear to accomplish 
these primarily by changes in meal frequency (Kan- 
arek, 1974; LeMagnen, 1971; Thomas & Mayer, 1968). 
For our present purposes, we simply wish to empha- 
size that (1) a wide range of intact animals with food 
and water readily available distribute their feeding 
behavior into distinct episodes which can be classified 
as meals; and (2) the daily frequency of the meals is
relatively insensitive to a number of experimental
manipulations which produce large changes in daily 
food intake, the latter changes being accomplished 
solely by changes in rate and duration of feeding. 

Drinking can be analyzed in the same fashion as 
eating, but there is much less information available 
regarding the pattern of drinking that occurs under 
conditions of unlimited access to water. Siegel and
Stuckey (1947) were the first to quantify Richter’s
(1927) observation of a diurnal pattern in rats. They
found that rats consume approximately 75% of their
water at night. This diurnal drinking pattern has
been found to persist during food deprivation (Fitzsimons
& LeMagnen, 1969; Oatley, 1971), following
bilateral nephrectomy when water intake is sharply
reduced (Fitzsimons, 1969), and even when the animal’s
fluid requirements have been met by continuous
intragastric water infusion (Fitzsimons, 1967). In
the rat (Fitzsimons & LeMagnen, 1969; Kissileff, 1969;
Marwine, 1974), dog (Robinson & Adolph, 1943),
guinea pig (Hirsch, 1973; Hirsch & Collier, 1974b),
and cat (Kanarek, 1975), drinking occurs in discrete
bouts with a strong temporal association between
feeding and drinking. In a carefully detailed
analysis, Kissileff (1969) has shown that in the rat,
bouts of drinking are small, with 78% of them being
between 0.5 and 2.5 ml. Approximately 75% of these
bouts occur either 10 min prior to a meal, during the
meal itself, or 10 min after the meal (see also Marwine,
1974).
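The kind of tally behind such a percentage can be sketched as follows. This is only an illustration: the function name, the data layout, and the sample times are hypothetical, and the 10-min window is the value quoted in the text.

```python
from typing import List, Tuple


def fraction_meal_associated(
    bout_times: List[float],
    meals: List[Tuple[float, float]],
    window_min: float = 10.0,
) -> float:
    """Fraction of drinking bouts that fall near a meal.

    bout_times: start time of each drinking bout, in minutes.
    meals: (start, end) of each meal, in minutes.
    A bout counts as meal-associated if it starts within window_min before
    a meal, during the meal, or within window_min after it.
    """
    if not bout_times:
        return 0.0

    def near_meal(t: float) -> bool:
        return any(start - window_min <= t <= end + window_min
                   for start, end in meals)

    return sum(near_meal(t) for t in bout_times) / len(bout_times)


# Hypothetical record: two meals and eight drinking bouts.
meals = [(100.0, 110.0), (300.0, 320.0)]
bouts = [89.0, 95.0, 105.0, 118.0, 200.0, 291.0, 325.0, 400.0]
print(fraction_meal_associated(bouts, meals))
```

In this made-up record five of the eight bouts fall inside a meal window.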


AVAILABILITY 


Another possible environment is one in which
food and water are not readily available and con- 
siderable time or effort is associated with obtaining 
food and water. One obvious example of this situation 
in the natural environment is the feeding pattern ex- 
hibited by carnivores which must pursue and capture 
reluctant prey varying in distribution and numbers 
(Schaller, 1972). A second example is provided by
those herbivores which exploit vegetation dispersed 
widely over their pasturage (Bell, 1971; Westoby, 
1974). A final example is visits to the water hole. 
Animals which must make periodic pilgrimages of 
varying distances to a water hole are, as a result, sub- 
ject to varying degrees of predation and competition. 
The timing of the trip may be determined more by 
the factors of predation, competition, and effort than 
by water and electrolyte needs (MacFarlane & How- 
ard, 1972). 

When environments of varying availability are simulated
in the laboratory by the use of operant techniques,
several important theoretical questions arise:


1. Will nondeprived animals tolerate demanding in- 
strumental response requirements to obtain their 
daily allotment of food or water in the free-feeding 
paradigm? That is, when food is delivered on 
schedules requiring large numbers of responses and 
long periods of time in which to execute them, 
must animals undergo “substantial” deprivation in 
order to initiate a meal? 

2. Is it necessary for the initial training of the re- 
sponse to take place under deprivation? This ques- 
tion speaks to the old controversy concerning the 
relation between habit and drive (Hull, 1943) and/ 
or the conditions for reinforcement (Skinner, 1938). 

3. Do animals undergo substantial weight loss between
meals when large response requirements are
imposed?

4. Will schedules and parameters of reinforcement 
exert typical effects given the modifications of the 
experimental environment conditions? 

5. What seem to be the most appropriate units of
analysis for assessing these conditions? 


A number of experiments were carried out in an 
attempt to answer these questions. The resulting data 
will be considered below. 


Meal Frequency as a Function of Ratio Size


In the first experiment that explored an environ- 
ment resembling natural feeding conditions, rats were 
continuously housed in the experimental chamber
(Collier, Hirsch, & Hamlin, 1972). Both food and 
water were always available during the first stage of 
the experiment, and the pattern of feeding and drink- 
ing was monitored. After a stable base line of feeding 
and drinking was established, a lever was introduced 
and access to food was made contingent on complet- 
ing a fixed-ratio (FR) schedule of reinforcement. Food 
remained available after a reinforced lever press for
as long as the animal was eating and until 10 addi- 
tional consecutive minutes after feeding had termi- 
nated. This criterion for a meal is similar to that 
which is commonly used in other research of feeding 
(e.g., Kissileff, 1970; LeMagnen, 1971; Thomas &
Mayer, 1968). This reinforcement paradigm differs
from the classical one in which a fixed amount of 
food or a fixed time of access is made contingent upon 
the required behavior. In the present situation, the 
animal both initiated and terminated the meal. The 
following ascending series of FR requirements was
used: 1, 5, 10, 20, 40, 80, 160, 320, 640, 1,280, 2,560,
and 5,120. Each schedule remained in effect for 10
days. No special shaping or training procedures were
necessary. The animals were not deprived by any
experimenter-controlled procedure, and at no time
did their body weights deviate by more than 5-6%
from their initial body weight. The data presented
here were taken from the last five days under each
condition.


Fig. 3. Frequency of meals for three rats as a function of ratio
size. (From Collier, Hirsch, & Hamlin, 1972.)

This set of requirements produced orderly changes 
in both instrumental and consummatory behavior. 
Figure 3 shows that these requirements led to a reduc- 
tion in meal frequency that was linearly related to 
log FR size. Under free-feeding conditions these 
animals had been eating 9-14 meals per day. This 
value decreased to 1 meal per day at FR 5,120. The 
performance of subject 2 began to deteriorate at FR
320, and his ratio was not increased over 640. The
reductions in meal frequency were associated with
compensatory increases in meal size. The latter resulted
from an increase in the amount eaten during a
meal rather than from changes in rate of eating (see 
Figure 4). Figure 5 shows that insignificant amounts 
of weight were lost under these conditions, even 
though there were small declines in daily food intake. 
The animals were run subsequently on a random 
sequence of these ratios, and the function was exactly 
recovered. There were no discernible practice or 
order effects. 
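A relation that is linear in log FR size can be summarized by a least-squares line fitted to meal frequency against the logarithm of the ratio. The sketch below shows the computation only; the function name is hypothetical and the four data points are invented for illustration, not read from Figure 3.

```python
from math import log10
from typing import List, Tuple


def fit_log_linear(
    fr_values: List[float], meals_per_day: List[float]
) -> Tuple[float, float]:
    """Least-squares fit of meals_per_day = intercept + slope * log10(FR).

    Returns (slope, intercept).
    """
    xs = [log10(fr) for fr in fr_values]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(meals_per_day) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, meals_per_day))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Invented example: meal frequency falls by 3 meals per day for each
# tenfold increase in the ratio requirement.
ratios = [1, 10, 100, 1000]
meals = [12.0, 9.0, 6.0, 3.0]
slope, intercept = fit_log_linear(ratios, meals)
print(round(slope, 6), round(intercept, 6))
```

The slope gives the change in daily meal number per tenfold increase in the ratio, and the intercept is the fitted meal frequency at FR 1.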

Several noteworthy features of instrumental per- 
formance were apparent when free feeding was con- 
strained in this manner. First, all animals acquired 
the lever-pressing response without shaping within 
hours after the contingency was imposed. At FR 1 all
animals bar-pressed in excess of the schedule requirement; by
FR 5, however, the number of reinforcements was an
accurate reflection of the number of meals consumed.
The rate of responding was not systematically related
to FR size. One animal showed ratio strain at FR 320
with long pauses between bursts of responding. At FR
5,120 a second animal began to show signs of ratio
strain. The third animal never showed evidence of
ratio strain. At the higher ratios the animals tended
to respond at steady rates for periods of up to 2-3 hr.


Fig. 4. Meal size and rate of eating for three rats as a function
of ratio size. (From Collier, Hirsch, & Hamlin, 1972.)

Fig. 5. Body weight and food intake for three rats as a function
of ratio size. (From Collier, Hirsch, & Hamlin, 1972.)


The most striking feature of these data is the size of 
the ratios tolerated by these nondeprived animals.
Ratios of this size have seldom been reported to sustain
stable performance (Ferster & Skinner, 1957), and
when they have, the data were obtained from deprived
animals with conditioned reinforcers built into the
schedule (Findley, 1962; Findley & Brady, 1965). It
should also be noted that in the Findley situation the
reinforcement was of a fixed size determined by the
experimenter rather than by the animal. It is conceivable
that the limits on ratio size in the present
situation are not determined by the demands of the
schedule of reinforcement, but by the inability of rats
to process large volumes of food in short time periods.
When food availability is restricted to an hour or less
a day, rats do not maintain ad lib levels of intake and
lose weight (Ehrenfreund, 1959; Lawrence & Mason,
1958).

As previously stated, there was little tendency for a 
rat to interrupt a run of responding. This may be ex- 
plained by the fact that each schedule requirement 
has associated with it a certain number of initiations
per day. For example, at FR 80, the rats typically ate
six meals per day. Thus it would appear that the
effect of a long pause in a run of responses would be
similar to requiring another initiation of responding
and, at any given ratio, would reduce the actual number
of meals the animal would obtain. This consequence of
pausing would tend to enforce uninterrupted runs of 
responses to the completion of the requirement. 

The reduced frequency of meals resulting from in- 
creased ratio requirements is associated with a com- 
pensatory increase in the size of the meals. Thus it 
appears that both meal initiation and termination are
under strong schedule control. This conclusion is
strengthened by the fact that a significant correlation 
between meal size and the intermeal interval under 
these conditions has not been found (for contrary 
data, see Levitsky, 1974). 

There are large individual differences in the size 
of the ratio tolerated. It is possible that much higher 
ratios could be sustained if the meal definition were 
changed to allow pauses greater than 10 min. Such a 
procedure would allow the animal to extend a meal 
and slow ingestion rate without facing reinitiation. 
However, this variable remains to be explored. 

The effect of food availability (FR size) on fre- 
quency and duration of meals is not unique to rats. 
Adolescent cats (Kanarek, 1975) and young, growing
guinea pigs (Hirsch & Collier, 1974a), which not
only must maintain intake at the base line level but 
also must increase intake in order for growth to 
progress in a normal manner, respond to the regula- 
tory problems posed by constraints on food availabil- 
ity in remarkably similar ways. Figure 6 shows the 
growth curves of six male guinea pigs tested under 
these conditions. Each successive FR represents a 
block of four days. It is apparent that normal rates of 
growth, established by a control group of six animals 
maintained concurrently on noncontingent feeding, 
were maintained in all animals until at least FR 1,280. 
The guinea pig adopts a somewhat different strategy 
than the rat for conserving normal levels of food in- 
take. His increase in meal size results from increases 
in both meal duration and eating rate (see Figure 7).


ratio was in effect for four days. (From Hirsch & Collier, 1974a.)

Fig. 8. Frequency of meals for two cats as a function of ratio
size. (From Kanarek, 1975.)


The strategy of the cat is more similar to that of the 
rat: increases in meal size are accomplished solely by 
changes in meal duration rather than in rate of inges- 
tion. Particularly notable was the fact that one cat 
was able to sustain stable performance at FR 10,240. 
Both the reduction in meal frequency (Figure 8) and 
the increase in meal size (Figure 9) as a function of the 
ratio requirement were completely recoverable during 
a random sequence of FR values for one cat and dur- 
ing a descending sequence for the other. 

In these three species, the relationships between
food availability (FR size), the parameters of feeding
behavior, and instrumental performance are also observed
when the nature of the operant is varied. In an
attempt to manipulate the energy cost of a meal,
Kanarek (1973, unpublished research) made access to
the feeder contingent on wheel turns rather than
lever presses. The same functional relations were obtained.
Similarly, Levitsky (1974) found a decrease in
meal frequency and an increase in meal size as a function
of the amount of time that rats were required to
hold down a lever to gain access to a feeder when the
rat controlled the initiation and the size of the meal.
It should be noted that his data are reported and interpreted
solely in terms of meal size and intermeal
interval.


[Figure 9. Two panels, one per cat. Horizontal axis: FIXED RATIO
SIZE (ad lib to 10,240). Filled circles: ASCENDING series. Open
circles: RANDOM series (upper panel) or DESCENDING series
(lower panel).]

Fig. 9. Food intake per meal for two cats as a function of ratio
size. (From Kanarek, 1975.)

Almost identical relations are observed in rats 
when water availability is restricted by increasing FR 
requirements. There 15 a monotonic decreasing réla- 
tionship between FR size and the number of bouts of 
drinking that secur over 4 24-hY period (Marwine., 
1974), ‘Ube paradigm used to study drinking is equiv- 
alent in all major respects to that previously described 
for the feeding environment. Following the comple- 
tion of the FR requirement, a drinking tube is in- 
serted into the cage. The tube remains available for 
as long as the animal is licking plus 5 min. Dhere 1g a 
regular decrease in bout frequency and an increase in 
bout size as a function of the ratio requirement. Both 
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descending sequence of FR values. These shifts in 
patterns of drinking are not completely successful in 
maintaining total water intake at control levels. At 
the higher ratios (FR 80 and above) there is a small 
decline in total intake, but this does not reduce intake 
below the obligatory requirement. ‘That is, food in- 
take and weight gain do not differ from controls. 
Two other observations illustrate the generality of 
the behavioral effects that are observed when the 
availability of a reinforcer is constrained. Rats readily 
learn to lever press on FR schedules (Collier & Hirsch,
1971) or to lick (Premack, Schaffer, & Hundt, 1964) for
access to a running wheel. They will also lever-press
to start a voluntary treadmill (Collier & Hirsch, 1971).
When the duration of the running episodes is under
the animal’s control, increases in the FR size lead to a
decrease in the number of bouts of running (Figure
10), but a compensatory increase in the amount of
running per bout. Although these changes conserve
total amount of running, they are insufficient at the 
highest ratios to maintain the level of running of the 
controls. Also, Adair and Wright (1973) have shown 
that behavioral thermoregulation is sensitive to the 
effort involved in controlling environmental tempera- 
ture. When the force required to pull a chain that 
changes environmental temperature was increased, 
squirrel monkeys tolerated greater extremes in the
amplitude of the air temperature. Not only did they 
allow a cold environment to become much colder at 
high force requirements, but also they tolerated a 
much warmer temperature before chain pulling was 
initiated. 


Rate of Ingestion 


Another version of a free-feeding environment is 
one in which the meal consists of discrete units—for 
example, nuts, berries, or pellets. The laboratory ver- 
sion is accomplished via the pellet dispenser in which 
the rate of delivery of pellets is usually under the 
animal’s control (Balagura & Coscino, 1968; Kissileff, 
1970; Teitelbaum & Campbell, 1958). Under these
circumstances, meals consist of a number of pellets. 
This is similar in some respects to the classic rein- 
forcement paradigm in which a deprived subject is 
provided small units of the appropriate commodity 
contingent upon a specified response sequence. The 
size of the unit (weight, volume, concentration, access 
time) is fixed. The rate at which the item is delivered 
to the subject is dependent upon rate or pattern of 
responding on some schedule (e.g., ratio) and is inde- 
pendent of response rate on others (e.g., interval). In 
our major departure from this traditional procedure, 
the animals live in the experimental space and obtain 
all their food or water over the 24-hr period by satis- 
fying the schedule requirements. On this 24-hr 
regimen the animal can initiate the required response 
sequence at any time and by successive repetitions of 
this sequence can control the amount consumed. As 
stated before, a meal consists of a series of discrete 
pellets, each obtained on an FR schedule. The end of 
a meal was defined by the passage of 10 min without the 
initiation of a ratio run. The rate of ingestion is con- 
strained by both the size of the reinforcer and the 
schedule on which it is delivered. 
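
The 10-min criterion above can be made concrete with a minimal sketch. It assumes, as a simplification not stated in the text, that each ratio run is logged by its start time in minutes and that the 10 min are counted from the start of the previous run. The function name and the sample times are hypothetical.

```python
from typing import List

# Criterion taken from the text: a meal ends once 10 min pass
# without the initiation of a ratio run.
MEAL_END_GAP_MIN = 10.0


def segment_meals(run_starts_min: List[float],
                  gap_min: float = MEAL_END_GAP_MIN) -> List[List[float]]:
    """Group ratio-run start times (minutes, ascending) into meals.

    Assumption: the gap is measured between successive run starts.
    A gap of gap_min or more closes the current meal and opens a new one.
    """
    meals: List[List[float]] = []
    for t in run_starts_min:
        if meals and t - meals[-1][-1] < gap_min:
            meals[-1].append(t)      # still within the current meal
        else:
            meals.append([t])        # first run of a new meal
    return meals


# Hypothetical log: six ratio runs, each earning one pellet.
example_starts = [0, 2, 5, 30, 33, 50]
for i, meal in enumerate(segment_meals(example_starts), start=1):
    print(f"meal {i}: {len(meal)} pellet(s), runs started at {meal}")
```

Under this definition, meal size is the number of pellets in each group, and the intermeal interval is the time between the last run of one group and the first run of the next.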

Adult rats and growing guinea pigs were tested on 
an ascending series of FR schedules with reinforce- 
ment consisting of single 45-mg pellets (Collier, 
Hirsch, & Hamlin, 1972; Hirsch & Collier, 1974b). The 
sequence of FR sizes used was 1, 5, 10, 20 followed by 
increments of 20 up to FR 240. Each schedule re- 
mained in effect for 10 days. Figure 11 shows the 
changes in instrumental performance for two rats 
tested under these conditions. Rate of responding in- 
creased sharply as a function of FR size. The changes 
in momentary rate were actually larger than the figure 
indicates because the rate measure included the post- 
reinforcement pause, which also increased substan- 
tially with FR size. Although not measured directly, 

Fig. 11. Number of responses, rate of responding, and number
of reinforcements for two rats as a function of ratio size. (From
Collier, Hirsch, & Hamlin, 1972.)

Fig. 12. Number of meals, meal size, meal duration and inter-
meal intervals for two rats as a function of ratio size. (From
Collier, Hirsch, & Hamlin, 1972.)

the increase in the postreinforcement pause was vis-
ually evident in the cumulative records (see Collier,
Hirsch, & Hamlin, 1972). Response output, which 
stabilized between 60,000 and 70,000 responses per day 
at FR 160, was a monotonic increasing function of FR 
size. At asymptote, animals lever-pressed for almost 
14 hr per day. Despite the large increases in response 
output and rate of responding, the daily number of 
reinforcements was a decreasing function of FR size. 

Figure 12 shows concomitant changes in the param- 
eters of feeding behavior under the same conditions. 
At low ratios, meal frequency was somewhat higher 
than that normally observed when powdered chow is 
fed. Both animals showed a gradual reduction in 
feeding frequency from FR 1 to FR 160. That is, the 
increasing number of responses required to obtain a 
pellet extended the time required to obtain a fixed 
number of pellets but did not affect the number of 
pellets consumed per meal. The increase in meal size 
may be an artifact of meal duration, in that successive 
meals necessarily began to overlap since so much time 
was spent obtaining meals as the length of the meals 
increased. Meal duration was an increasing function 
of the ratio size, which reached asymptote at 60 min at
FR 160. Meal size showed a small increase to FR 160
and then decreased. The function relating the length
of the intermeal interval to FR size was less regular
but tended to increase up to FR 160 and then to de-
crease. The changes in the parameters of free feeding
were small and in many ways were secondary relative
to changes in the level and rate of responding in con-
serving the animal’s rate of ingestion, pattern of feed-
ing, and level of intake. Figure 13 shows that food in-
take and body weight were maintained at a constant
level until FR 60 under these conditions. The control
animals were maintained in the same housing as the
experimental group, but two had noncontingent ac-
cess to a dish of pellets and four were fed powdered
Purina Chow.

Growing guinea pigs tested under very similar con-
ditions showed the same general pattern of feeding
and changes in instrumental performance (Hirsch &
Collier, 1974). However, performance tended to
deteriorate at somewhat lower ratio requirements. 

When small amounts of water rather than food 
were used as the reinforcer in the paradigm requiring 
animals to complete an FR requirement for a fixed- 
size reinforcer, rats (Marwine, 1974) and guinea pigs 
(Hirsch & Collier, 1974b) showed a similar profile of 
adjustment and maintained intake at the control 
levels until approximately FR 50. Under the latter 
condition, the daily number of drinking bouts stayed 
relatively constant, and there was a small reduction in 

Fig. 13. Body weight and food intake for two rats as a func-
tion of ratio size. (From Collier, Hirsch, & Hamlin, 1972.)


the size of these bouts at the higher ratios when total 
intake declined. The temporal association between 
feeding and drinking (Kissileff, 1969) was largely un- 
affected by this constraint on drinking behavior. In 
the free-feeding paradigm, when the rate of ingestion 
was constrained there were several noteworthy 
features of instrumental performance. Again, it was 
observed that nondeprived animals required no shap- 
ing. Also, nondeprived animals defended their total 
intake by meals of large response outputs at very high 
rates of responding, over very long (14-hr) periods of 
time. 


CALORIC REGULATION AND 
CHOICE OF DIETARY ITEMS 


Species differ widely in their ability to control 
caloric intake and to select a nutritionally adequate 
diet. These differences reflect both their ecological
niche and position in the food chain. 


Regulation


Regulation of energy balance implies a matching 
of caloric intake to caloric expenditure. One common 
test of regulation is to present animals with diets of 
varying caloric density. Regulation is inferred when 
the animal adjusts volume intake in such a fashion 
that total caloric intake remains constant across diets. 
When the availability of the source of calories is con- 
currently varied such that meal frequency and size 
both vary, the test becomes more powerful. Using 
both availability and caloric concentrations as vari- 
ables, Kanarek (1974) tested the ability of rats and 
cats to maintain body weight by adjusting volume 
food intake. Availability was manipulated by the use 
of ratio schedule, and the reinforcer consisted of un- 
limited access to a tunnel feeder. In the rat, low- and 
high-caloric-density food increased and decreased, re- 
spectively, the frequency of feeding in such a fashion 
that caloric intake remained constant. These relations 
were obtained at each level of availability and were 
additive rather than interactive.

On the other hand, cats faced with the same prob-
lem did not control total intake appropriately at any
level of availability; rather, they appeared to eat solely
for bulk. Guinea pigs similarly tested also tended to
hold bulk intake constant. However, they surpassed
the cats in their ability to vary the efficiency of food
utilization, and their weight did not fluctuate as
widely when their caloric intake varied to the same
degree (Hirsch, unpublished observation; Kanarek,
1974). The above results show that the nutritive prop-
erties of the reinforcer are important in determining
both the pattern of feeding and the ancillary instru-
mental behavior. These species differences have im-
portant implications for an analysis of reinforcement 
which will be considered subsequently. 
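
The contrast between regulating calories and eating for bulk can be illustrated with a short sketch. The caloric target, the bulk amount, and the two diet densities are hypothetical numbers chosen only for illustration; they are not values reported by Kanarek (1974).

```python
def regulated_volume_g(target_kcal: float, density_kcal_per_g: float) -> float:
    """Grams eaten by an animal that holds caloric intake constant."""
    return target_kcal / density_kcal_per_g


def bulk_eater_kcal(bulk_g: float, density_kcal_per_g: float) -> float:
    """Calories obtained by an animal that holds bulk (grams) constant."""
    return bulk_g * density_kcal_per_g


TARGET_KCAL = 80.0       # hypothetical daily caloric requirement
BULK_G = 20.0            # hypothetical fixed daily bulk intake
DENSITIES = (2.0, 4.0)   # hypothetical low- and high-density diets, kcal/g

for density in DENSITIES:
    grams = regulated_volume_g(TARGET_KCAL, density)
    kcal = bulk_eater_kcal(BULK_G, density)
    print(f"{density} kcal/g: calorie regulator eats {grams:.0f} g "
          f"({TARGET_KCAL:.0f} kcal); bulk eater eats {BULK_G:.0f} g ({kcal:.0f} kcal)")
```

In this sketch the regulator's volume intake moves inversely with caloric density, which is the pattern the text attributes to the rat, while the bulk eater's caloric intake moves with density, which is the pattern attributed to the cat.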


Self-Selection 


Faced with a variety of dietary items, some animals 
are able to select a nutritionally adequate diet. Rats 
are well known for this ability (Lat, 1967). An experi- 
mental paradigm that permits study of the strength of 
the tendency of an animal to balance its diet has been 
developed by Hirsch (unpublished data). In this pro- 
cedure, one component of the diet (e.g., a carbohy- 
drate source such as sucrose) is presented freely while 
a second component (e.g., a protein source such as 
Purina Chow) is presented contingent upon an oper-
ant. The effort requirement can be varied and its
effect on the intake of the two items explored. In a 
first experiment, rats pressing for Noyes pellets (ap-
proximately 23% protein) and offered “free” sucrose
made as many as 50,000 responses per day maintaining 
at least a minimal (6%) protein intake at the highest 
ratio. Similarly, when the protein-containing com- 
ponent of the diet was free and the rat worked for 
carbohydrate (sucrose), a large number of responses 
were expended to maintain the protein/carbohydrate 
ratio (Collier, unpublished data). Thus under free- 
feeding conditions, some species will expend a large 
number of responses to maintain their dietary bal-
ance.


OTHER ENVIRONMENTAL CONSTRAINTS 


In the two preceding sections, we have shown that
constraints on two parameters of meals, frequency and
rate of ingestion, lead to compensatory changes in
other parameters such that total intake is conserved.
Constraints on the distribution of meals have similar
effects. For example, many animals show diurnal
cycles of feeding. These cycles have been the inspira-
tion for the study of a wide variety of biological
rhythms (Richter, 1927). Most recently, feeding cycles
have been hypothesized to reflect changes in the basic
metabolic processes involved in the control of food
intake. For example, LeMagnen and co-workers (Le-
Magnen, Devos, Gaudilliere, Louis-Sylvestre, & Tal-
lon, 1973) have argued that rats eat in excess of their
requirements in the dark and store energy (i.e., are
lipogenic), whereas in the light they eat less than their
requirements and consume energy (i.e., are lipolytic).
[...] regime pressing a lever for pellets and found that con-
straining the rate of ingestion by introducing a ratio
requirement profoundly affected the temporal distri-
bution of free-feeding behavior. Under conditions in
which the ratio requirement was the same in both
parts of the 12-hr light-dark cycle, the rats ingested
approximately 70% of their food in the dark phase,
the commonly reported value. However, this pattern
of eating could be completely reversed by program-
ming food to be available on a multiple schedule with
continuous reinforcement during the 12-hr light pe-
riod and various FR sizes during the dark. The
larger the FR, the greater the shift of eating to the 
light component. It appears that, given a choice, rats 
prefer to obtain food in the most “efficient” manner 
available, even if this requires altering their customary 
diurnal patterns. Boice (1972) had previously reported 
similar data for water intake. Finally, increasing the 
palatability of the diet in the light phase also effected 
a shift in the diurnal distribution of intake (Panksepp, 
1973; Panksepp and Krost, 1975). Thus the under-
lying physiological substrate such as that demon-
strated by LeMagnen et al. (1973) can be easily 
disassociated from the pattern of meal taking by 
environmental constraints. 


RESPONSE STRENGTH 


One of the assumptions of Skinner’s original anal- 
ysis was that rate of responding is the measure of the 
strength of the response components in the consum-
matory chain.


Response Rate 


One interesting finding of the present studies is the 
differential sensitivity of both average and local rates
of responding to the two experimental paradigms.
With unrestricted access to the food magazine, no in-
crease in the average rate of responding during a ratio
run was observed in cats or rats, and only a small in-
crease was seen in the guinea pig. This finding con-
trasts with the dramatic increase in response in the
more typical “within-meal” (i.e., between-pellet) pro-
cedure in which each ratio run is followed by a single
pellet or a small drop of water. The differential effects
of FR size on rate of responding can be understood if
one considers the utility of rate increases in the two
cases. An increase in the rate of responding when the
duration of access to food is not constrained has no
effect on food availability or rate of ingestion. The
only contingency between rate and the parameters of
meal taking is that increases in rate will shorten the
time from the initiation of a ratio run to the delivery
of food. This contingency appears to exert some con-
trol on the guinea pig, which eats faster when access is
constrained (cf. Hirsch & Collier, 1974a, 1974b). On
the other hand, in the pellet or drop paradigm when 
only single pellets of food or drops of water are avail- 
able, a long FR chain between pellets or drops de- 
creases the rate of ingestion. In this instance, increases 
in the rate of responding serve to counteract the con- 
straint on rate of ingestion. This contingency exerts 
strong control in the two species tested. Thus we have
two different situations. In one, changes in ratio size 
do not affect rate of instrumental responding; in the 
other, they do. These appear to result from the effect 
of the schedule on the rate of ingestion. This differ- 
ence might be dramatized if the two paradigms were 
combined in such a fashion that the subject would be 
required to complete a ratio to gain access to a meal. 
The meal would consist of pellets delivered on a 
second ratio requirement which would remain in
effect until 10 min of no responding had passed. In
this situation, each ratio would be expected to have
different effects on the meal parameters. The size of
the first ratio would influence the frequency of initiat- 
ing meals and the size of the meals and would not 
affect the rate of responding between pellets. 

Preliminary results confirmed these expectations. 
Thus there appears to be a direct relationship be- 
tween the rate of responding and the utility of a 
given rate in the food economy. This suggests that the 
classical notion of rate as a measure of response 
strength only obtains under carefully circumscribed 
conditions. 


Magnitude of Reinforcement 


The rate analysis can be explored further by con- 
sidering “magnitude of reinforcement.” In the con- 
ventional account, increasing the concentration, qual- 
ity, amount, or time of access to reinforcement 
increases the rate (e.g., Collier, 1962; Collier & Siskel, 
1959; Collier & Willis, 1961; Guttman, 1953) or 
reciprocal latency (e.g., Bolles, 1967) of responding. 
In the 24-hr procedure in which the rat earns his total 
intake by bar-pressing for pellets and in which there 
are no constraints on the initiation and termination 
of meals, the prediction is exactly the opposite. For 
example, an adult rat will eat about 500 45-mg stan- 
dard Noyes pellets per day. A ratio requirement of 
200 bar presses per pellet would require an output of 
100,000 presses per day to maintain this intake. 
Neither the rat nor the guinea pig defends its intake 
completely, but each comes close: the rat makes on the 
order of 70,000 presses per day at this ratio, the guinea 
pig somewhat less. If the pellet size is increased to 90 
mg, it would require one-half the responses to main- 
tain intake, and if it were reduced to 22 mg, it would 
require double the number of responses. It is not sur- 
prising that this actually occurs. It is surprising, how- 
ever, that the average between-pellet rate of respond- 
ing (including the postreinforcement pause) is higher 
for small pellets and lower for large pellets. This is 
contrary to usual magnitude-of-reinforcement predic- 
tions. Although the local rate in this situation has not 
been analyzed in detail, preliminary evidence suggests 
that even local rate is higher for the smaller pellet 
and lower for the larger. This negative relation be- 
tween pellet size and rate appears to be the case only 
for nondeprived animals whose total intake is gained 
in the experimental situation. When the animal is 
depleted (i.e., greater than 7-10% body weight loss),
there is a positive relation between the parameters of 
reinforcement (e.g., volume, concentration, duration)
and rate of responding (Collier & Meyers, 1961; Col- 
lier & Willis, 1961). Here again we see an important 
difference between the freely feeding animal and the 
depleted one. 
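
The response-output arithmetic in this section can be checked with a minimal sketch. The intake of about 500 pellets of 45 mg, the ratio of 200, and the figure of roughly 70,000 presses come from the text; the function name is hypothetical, and the 22-mg case is only approximately half of 45 mg, so it yields slightly more than double the requirement.

```python
def presses_to_defend_intake(daily_intake_mg: float,
                             pellet_mg: float,
                             fixed_ratio: int) -> float:
    """Lever presses per day needed to earn the full daily intake."""
    pellets_needed = daily_intake_mg / pellet_mg
    return pellets_needed * fixed_ratio


DAILY_INTAKE_MG = 500 * 45   # about 500 pellets of 45 mg per day (from the text)
FIXED_RATIO = 200            # presses required per pellet (from the text)

baseline = presses_to_defend_intake(DAILY_INTAKE_MG, 45, FIXED_RATIO)
for pellet_mg in (22, 45, 90):
    needed = presses_to_defend_intake(DAILY_INTAKE_MG, pellet_mg, FIXED_RATIO)
    print(f"{pellet_mg:>2} mg pellet: {needed:,.0f} presses/day "
          f"({needed / baseline:.2f} x the 45-mg requirement)")

# Share of the 45-mg requirement met by the roughly 70,000 presses
# per day reported for the rat at this ratio.
print(f"fraction of intake defended: {70_000 / baseline:.2f}")
```

With 45-mg pellets the sketch reproduces the 100,000 presses per day quoted above, and the 90-mg pellet halves that requirement.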


Ratio Tolerance 


In the situation in which access to the feeder is un- 
restricted until the meal is completed, it can be asked 
whether the tolerance of very large ratios is simply a 
matter of the size of the reinforcement taken follow- 
ing the response chain, or whether this tolerance em-
bodies other principles. The most attractive conjecture
is that the subject’s termination of the meal (run-
ning burst, etc.) is the critical event. That is, consider
a meal to be not simply a collection of reflexes waxing 
and waning in strength, but rather a dynamic unit, 
the initiation and termination of which reflect cen- 
tral processes under the control of the animal’s total 
economy. From this perspective, then, the completion 
of the meal may control response strength, rather 
than the summation of strengths of the individual re- 
sponse units which make up the meal. It is quite clear, 
for example, that those features of the environment 
which cause an animal to eat fewer meals also cause 
him to eat longer meals. Total intake is conserved. 
Thus at the same body weight, eating the same food 
item, meal length covaries with frequency of initia- 
tion. How does the animal measure the amount con- 
sumed? It is unlikely, particularly when frequency 
has only been partially reduced, that the stomach will 
be emptied or the reserves depleted to such a degree 
that they can act as signals for extending a meal. The 
likelihood of immediate feedback in the free-feeding
situation seems low enough to encourage speculation 
about a guidance system which anticipates rather than 
responds to needs (cf. Oatley, 1970). 


Unit of Analysis 


One last argument that can be made with respect to 
response strength derives from the question of the 
appropriate unit of analysis in feeding behavior. From 
studies in which the “effort” to obtain a meal is
varied, it is clear that the frequency, duration,
amount, and rate of eating covary in such a fashion
that total intake over a given 24-hr period is
conserved. The implication of these facts is that Skin-
ner’s focus on the dynamics of the within-meal reflex
chain as the way in which to study total intake may 
have been misdirected. He argued that all-or-none 
events such as meals were most appropriately analyzed 
in terms of the within-meal strength of the reflex
chain involved in ingestion. The frequency of occur- 
rence and size of meals should be predictable from a 
knowledge of the “‘strength” of the components. How- 
ever, it would appear from the present data that total 
intake over the feeding cycle is the important bound- 
ary condition. Feeding behavior is path-independent 
in the sense that the final state, total intake, does not 
depend on the particular responses and their rate of 
occurrence within a meal. Thus one would not expect 
correlations among the more typical measures of re- 
sponse strength such as rate, extinction, or ratio toler-
ance except under particular circumstances. The
path-independence argument is similar to Skinner’s
(1935) argument for the generic nature of the reflex in
which the class is defined by its effect rather than by
the order and topography of its members. If we ac-
cept the path-independence argument, we must con-
cur with Richter’s (1927) original conjecture that the
meal is the appropriate unit of analysis of feeding
behavior.


FOOD ECONOMY 


Psychologists have long paid lip service to the 
theory of evolution, but they have seldom paid atten- 
tion to its consequences for psychological theory ex- 
cept in the most global terms. Most have accepted the 
notion that the morphology of a species reflects its 
adaptation to a particular ecological niche, but until 
recently few have accepted the notion that particular 
patterns of behavior similarly represent such adapta-
tions (Barash, 1974).

Historically, the most important variables in the 
study of feeding related to depletion and repletion of
some nutritional item. One had only to find the sub-
strate which the hypothalamus was monitoring and
one could predict the onset and offset of feeding be-
havior from knowing its current state. Aside from
slight differences in equipment, all species would be-
have similarly—that is, obey the same laws. Behavior
successful in meeting nutritional needs was thereby 
strengthened; unsuccessful behavior was thereby weak- 
ened. 

Our suspicion that animals were economists as well 
as physiologists led to the research discussed in this 
chapter. It is obvious that feeding patterns (i.e., item 
choice and the frequency, duration, rate, and distribu- 
tion of meals) differ widely from species to species. Of 
greater interest and import, however, are the ques- 
tions of how these different patterns relate to the 
ecological niche in which the animal naturally oper- 
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ates and of how they are reflected in general problems 
of behavior. One approach to these problems is taken 
by the ecologists. For example, quoting from Schoener
(1971, p. 369):


Natural history is replete with observations on 
feeding, yet only recently have investigators be- 
gun to treat feeding as a device whose perfor- 
mance—as measured by net energy yield /feeding 
time or some other units assumed commensurate 
with fitness—may be maximized by natural selec-
tion. . . . The primary task of a theory of feed-
ing strategies is to specify for a given animal that
complex of behavior and morphology best suited
to gather food energy in a given environment.


Schoener views the problem as one of optimization 
which can be further “trisected” into tasks of: (1) 
choosing a currency, (2) choosing the appropriate cost-
benefit function, and (3) solving for the optimum. For
our present discussion, the most important idea in this
view is that an animal does not approach the problem
of feeding naively; rather, he brings a strategy to the
problem—a strategy which psychologists typically do
not see because of the restrictions of their experi-
mental situations. The usual laboratory setting is
specifically arranged to minimize or prevent the dis- 
play of any pattern of behavior other than that which 
the experimenter is measuring. 

In an animal’s natural environment, both the 
initiation and termination of feeding are under the 
control of the animal. “When,” “how often,” and “the
size and type of item” are the basic parameters of any 
feeding strategy which have to be adjusted to the 
density (i.e., probability of encounter) of the item, its 
caloric and nutritional content, and energy or time 
cost of obtaining the item. The basic problem for the 
animal is to partition his time and energy between 
the many different activities which insure his reproduc- 
tive success. An animal which did not consider the 
density, cost, and character of the item to be procured
but waited to initiate feeding when it was “hungry” 
in the physiological sense would simply be unlikely to 
survive in any but the most permissive environment. 
It seems more likely that animals must, in some sense, 
feed in anticipation of, rather than in response to, 
their needs. Obvious examples of such anticipating 
behavior are premigration hyperphagia (Odum, 1960) 
and hibernation (Mrosovsky, 1971). To investigate 
these speculations, it is necessary to introduce ecolog- 
ical variables into the laboratory—that is, to construct 
analogues of niches in laboratory situations. We view 
the present research as such an attempt. 

Some insight into the role of these biological vari-
ables can be gained by examining a class of phenom-
ena not typically considered by psychologists. From 
the work of ecologists, it is clear that there are many 
environmentally defined events which control feeding. 
If one examines the feeding pyramid, there is a chain 
of exploitation of resources. Animals can be grouped 
into three major categories: herbivores, carnivores, 
and omnivores. Herbivores harvest the readily avail- 
able plant energy, but they pay for this in terms of 
low nutritional quality and low caloric density of 
their food. Their foods require intensive mechanical 
degradation (e.g., rumination) and chemical conver- 
sion (fermentation) before digestion to produce a 
nutritionally adequate diet. Because the varied in- 
gestants of the generalized herbivore (Westoby, 1974) 
are modified by fermenting into a nutritionally ade- 
quate diet, it is unnecessary for him, except in the case 
of certain minerals, to eat a highly selected diet. At 
the other extreme, the carnivores exploit a niche in 
which the food is of the highest quality and caloric 
density but is not readily available. The carnivore’s 
problem is one of procurement. Bulk intake alone 
can accurately regulate both calories and nutritional 
quality. Finally, omnivores, by being opportunists, can 
occupy a wide variety of habitats. This poses quite 
different problems for these animals. The variability 
in both caloric density and nutritional quality of the 
omnivore diet requires the omnivore to adjust his in- 
take to his needs by choice of item and amount con- 
sumed on a day-to-day basis (Schoener, 1971; Westoby, 
1974).

The extensive use of the rat in the study of feed-
ing and nutrition has blinded many researchers to this
variety of feeding patterns and strategies. As a result,
the physiological models of feeding based on the be-
havior of the rat are limited to the omnivore model
and a very restricted experimental procedure.

In addition to considerations of the environmental 
niche, it is clear that social, seasonal, and climatic
factors have important impacts on feeding schedules 
(Westoby, 1974). Feeding strategies are intensively 
conditioned by environmental considerations (cf. 
Schoener, 1971). Operant psychologists, by their im- 
plicit acceptance of the rat feeding model, have failed 
to consider such ecological variables in their analyses 
of behavior. Attention must be paid to the structure 
of the environment and its interaction with response 
patterns. These factors may specify the units of be-
havior upon which reinforcement operates.

Considering only the variables arising from an 
animal’s niche, we can make a very preliminary anal-
ysis of their role in feeding patterns. The FR con-
straint on meal frequency can be viewed as a labora-
tory analogue of a niche in which food varies in
availability. The usual situation of the carnivore
exemplifies this niche (Estes, 1967a, 1967b; Kruuk,
1972; Schaller, 1967, 1972). The most typical circum-
stance is one of low availability of prey. Carnivores
typically, but not always, have a pattern of infrequent 
large meals. Since it appears, however, that the carniv- 
orous mode of feeding is in the repertoire of many 
species (Collier, Hirsch, & Hamlin, 1972; Hirsch & 
Collier, 1974a, 1974b; Kanarek, in press; Westoby, 
1974), an important generalization might be made re- 
garding the effect of food availability on ingestive 
behavior. 


Law of Availability 


As commodities (e.g., food, exercise, heat) become
less readily obtainable, the frequency of initiating be-
havior which procures them decreases, but the amount
taken per occasion increases in such a fashion that the
total amount consumed is conserved. Thus the law of
availability appears to relate to the animal’s efficiency
of allocation of resources. For example, as food be-
comes less readily available, the amount of time,
effort, and/or energy expended to obtain it increases.
As a result, it becomes more efficient to expend this
amount of time, effort, or energy less often and to take
larger amounts on any given occasion. On the other
hand, when the commodity is readily available, fre-
quency of initiation increases. This suggests that there
is a “tradeoff” between the cost of procurement and
the cost of use, the cost of use being related, for
example, to such variables as the cost of ingestion or
absorption and their effect on the efficiency of utiliza-
tion. The feeding behavior of large carnivores, such
as the lion, provides a classic example. These animals
must expend considerable effort and undergo high
risks of injury in procuring their usual game. As a re-
sult they live on a “feast-or-famine” regime (Schaller,
1972) which fluctuates with the density and size of prey.
Large and/or scarce prey lead to infrequent large
meals, whereas small and/or numerous prey lead to
frequent small meals (Schaller, 1972). It would seem
that the processing cost of large infrequent meals is 
higher than the cost of small frequent meals (cf. 
Morrison, 1968), such that animals revert to small fre- 
quent meals when possible. 
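
The tradeoff between the cost of procurement and the cost of use can be sketched with a toy model. Everything quantitative here is an assumption made for illustration and is not taken from the chapter: total daily intake is held fixed, each meal carries a fixed procurement cost, and the processing cost of a meal is assumed to grow with the square of its size, so that large meals are disproportionately costly to process.

```python
def best_meal_count(total_intake: float,
                    procurement_cost: float,
                    processing_coeff: float = 1.0,
                    max_meals: int = 50) -> int:
    """Number of equal-sized meals per day that minimizes total cost.

    Assumed cost of n meals:
        n * procurement_cost + n * processing_coeff * (total_intake / n) ** 2
    """
    def daily_cost(n: int) -> float:
        meal_size = total_intake / n
        return n * procurement_cost + n * processing_coeff * meal_size ** 2

    return min(range(1, max_meals + 1), key=daily_cost)


TOTAL_INTAKE = 20.0  # hypothetical daily intake, arbitrary units

for cost in (1, 4, 16, 100):  # hypothetical procurement costs per meal
    n = best_meal_count(TOTAL_INTAKE, cost)
    print(f"procurement cost {cost:>3}: {n:>2} meals/day, "
          f"meal size {TOTAL_INTAKE / n:.1f}, total intake {TOTAL_INTAKE:.0f}")
```

Under these assumptions, raising the procurement cost lowers the cost-minimizing number of meals and raises meal size while total intake stays fixed, which is the qualitative pattern the law of availability describes. A different processing-cost function would shift the numbers but not necessarily the direction of the effect.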

The law of availability is parallel to Schoener’s 
(1971) principle that the ratio of energy yield/time 
expended is maximized. Both principles raise, again, 
the difficult question of the appropriate dimensions 
for their terms. For example, what is the measure of 
the cost of “search time”? Is it the time, or the energy
expended per unit of time, or the amount of other 
behavior excluded during the search, or all of these? 
Similarly, what are the dimensions of availability (cf. 
Westoby, 1974)? Are they dispersion, density, diffi- 
culty, effort required, or danger? 

In any case, it is clear that this law cannot be ob- 
served in the usual experimental situation, since both 
initiation and termination of meals are constrained. 
In fact, the usual situation in which a deprived animal 
works for reinforcers of fixed size on a schedule for a 
fixed session length amounts to a single meal. What is 
being studied is the effect of various experimental 
variables on the course of a single meal. This is Skinner’s
original paradigm. The failure of this paradigm
to generate the laws which govern an animal’s usual
pattern of feeding has led to a questioning of the
assumptions on which it is based. In the more “natural”
situation, irrespective of the scarcity or abundance
of resources and the circumstances of their
availability, the animal initiates and terminates meals 
and controls the amount of the commodity consumed.
It is the circumstances of availability which appear to
be crucial.


Schedules


At the present time a set of unrelated principles 
and laws specifies what is known about schedules of
reinforcement. Attempts to organize these principles
into a coherent whole and reduce them to the deduc- 
tive consequences of a few axioms have mainly been 
directed toward probabilistic relations defining reinforcement
density (Schoenfeld, 1970). However, schedules
of reinforcement may also have biological significance,
and examination of schedules in light of
biological variables may yield the organization investigators
have been seeking. For example, one aspect of
the carnivore’s feeding which a schedule in the laboratory
may possibly simulate is stalking. Characteristic
of some hunting patterns are long periods of waiting.
Schaller (1972) documented this point in a description 
of hunting lionesses: 


What impressed me most was the patience and 
incredible fussiness they displayed. On a num- 
ber of occasions I have sat for more than an hour 
watching a lioness or a young lion waiting for 
a herd of gnus or zebras, already close, to come 
closer. Once, two lionesses let a file of gnus and 
zebras trek within fifty yards of where they 
crouched, in plain sight but unnoticed. Several 
times one or the other tensed for a spring when
a member of the file came slightly closer than
the rest. But still they waited, until a zebra 
spotted them and snorted an alarm. 


Schaller did not label this as a contingency of differential
reinforcement of low rates of responding (DRL), but
it is difficult not to. If the lioness were to move too 
soon or be seen, the entire behavior sequence would 
have to be reinitiated. This aspect of carnivore be- 
havior could be simulated in the laboratory by im- 
posing a DRL contingency and allowing the animal 
to control the size of the meal that became available 
when the DRL contingency was satisfied. The DRL 
contingency would functionally reduce itself to continuous
reinforcement (CRF) if the interval requirement
did not exceed the animal’s typical intermeal
interval length. For the rat, this time requirement 
would be approximately 90-120 min. Based on DRL 
performance in short test sessions, one would be 
forced to predict that this contingency exceeds the 
rat’s capacity for temporal discrimination. However,
behavior of this type is observed routinely on the 
part of carnivores that rely on stealth and short 
bursts of speed to capture prey. Such an experiment 
(DRL 2 hr or more) has not yet been conducted, but 
we would predict this type of performance to be in 
the repertoire of some animals. If so, increases in DRL 
length should lead to compensatory increases in meal 
size. Thus an analysis of the function of schedules in an 
animal’s economics may provide some order to schedule
effects. These accounts may not be unique, as the
two different effects of FR in the meal and pellet 
paradigms reported above show, and may be best 
understood from the perspective of the niche in which 
the schedules are employed. Thus two schedules iden- 
tical in terms of temporal and numerical structure 
may be functionally different depending upon the 
environment in which they occur. 
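
The DRL contingency discussed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any apparatus used by the authors: the function name `drl_reinforced`, the choice of minutes as the time unit, and all response times are hypothetical. A response is reinforced only if at least the required interval has elapsed since the preceding response, and every response resets the clock.

```python
def drl_reinforced(response_times, drl_interval):
    """Return the response times that satisfy a DRL contingency.

    A response is reinforced only when at least `drl_interval` time units
    have passed since the previous response (or since session start for
    the first response). Every response resets the timer.
    """
    reinforced = []
    last = 0.0
    for t in sorted(response_times):
        if t - last >= drl_interval:
            reinforced.append(t)
        last = t  # premature responses also restart the wait
    return reinforced


# Hypothetical response times in minutes under an assumed DRL 120-min requirement.
print(drl_reinforced([30, 100, 150, 300, 310, 450], 120))  # [300, 450]

# If responses are already spaced at a typical intermeal interval (assumed
# 120 min) and the requirement is shorter (90 min), every response is
# reinforced, so the schedule behaves like CRF.
print(drl_reinforced([120, 240, 360], 90))  # [120, 240, 360]
```

The second call illustrates the point made in the text: when the interval requirement does not exceed the spacing the animal already adopts, the contingency never withholds reinforcement.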


Dietary Selection and Regulation 


Niche variables can also be used to predict “nutri- 
tive” behavior. As a result of their particular diet and 
their mechanics and chemistry of digestion, herbivores 
must consume large quantities of food at relatively 
low rates. Thus one would predict not only a high 
meal frequency, as previously pointed out, but also a 
relative insensitivity to the nutritional quality or 
caloric density of their food. Studies using guinea 
pigs, a monogastric herbivore, have supported these 
conjectures (cf. Hirsch, 1973). Studies on the ability
of herbivores to self-select a balanced diet are not 
known to the authors. 
Cats likewise appear to be bulk eaters and relatively
insensitive to the caloric content of the diet
(Hirsch, unpublished data; Kanarek, 1974). Their 
ability to select a nutritionally adequate diet also re- 
mains unexplored. Like their wild counterparts, 
domestic cats can go for long periods without food 
and then consume an amount sufficient to maintain 
growth or adult body weight. Rats, on the other hand,
are well known for their ability to regulate caloric 
intake and select an adequate diet. 

Thus it seems that exogenous variables play a 
dominant role both in feeding and in determining the 
effect of schedules on response patterns. Further, it 
appears possible to translate the parameters of feeding 
in the wild into operant laboratory procedures.


CONCLUSIONS 


At the outset of this chapter, we considered two 
questions raised by Skinner’s original formulation of 
an analysis of feeding behavior. The first concerned 
the definition of a unit of analysis: Is the reflex, an 
arbitrary unit which acquires its behavioral meaning 
by its appearance in “simple” laws, the most appro- 
priate unit with which to study feeding behavior, or 
would larger units such as the meal or the feeding 
cycle lead to a more general principle which might be 
translatable within a natural context? Data presented 
in this chapter showed that, using the meal as the 
basic unit of analysis, such laws can be stated, and sug- 
gested exploration of a different class of variables 
which reflect the structure of the animal’s environ- 
ment. 

The second question concerned the model of moti- 
vation (reinforcement) which would be most useful in 
studying feeding behavior. The classic model of feeding
is one in which behavior is generated, shaped, and
maintained in response to physiological needs (defi- 
cits). This model has led to the search for underlying 
physiological perturbations as the occasion for all be- 
havior, whether it be to meet other environmental de- 
mands (e.g., temperature regulation) or social behavior 
(e.g., courtship or maternal care). Recent descriptions
by ethologists have emphasized the elicitation of such 
behavior by particular concatenations of events in the 
environments rather than by the push of physiological 
needs. In the present analysis, we have suggested that, 
except in cases of emergency, the ethological model 
which describes social behavior is also appropriate for 
feeding behavior. That is, animals possess a repertoire
of feeding strategies which are appropriate to the 
niche they occupy and which vary with varying habitats 
within the niche. It is obvious that for many species
these strategies are in whole or in part constructed 
and/or modified by interaction with the environ- 
ment. 

Returning to the problem of the unit of analysis 
and the question of the units upon which the conse- 
quences of behavior act, let us examine the question 
of reinforcement. The classic view is that complex behavior
is assembled from reflexes and that consequences
operate directly on the individual reflex
units. Another possibility, however, is that behavior is 
preassembled into dynamic units and that the changes 
induced by the outcome of a behavior sequence affect 
not only the occurrence of the larger unit, but also the 
interrelations of members within the unit. For ex- 
ample, hyenas hunt in packs. When the available 
game hunted is small and the hunt results in a few 
well-fed and several frustrated and quarreling mem- 
bers of the pack, the pack will break into smaller 
units suitable to the game size. If the available game 
again becomes numerous, the pack will reassemble 
(Kruuk, 1972). This example illustrates how the con- 
sequences, insufficient food in a single meal, can 
affect a complex social structure. We would conjecture 
that a common principle underlies feeding patterns of 
individual animals as well. That is, the feeding strat- 
egies of all animals are matched to the niches in 
which they operate. Further, these feeding strategies 
are adjusted as a unit when the parameters of the 
niche change in such a fashion that the ratio of calories
gained per meal to the time and/or effort procuring
the meal is maximized. Where do learning and
motivation enter into this process? We think that 
“learning” in this situation is the process by which the 
animal modifies the parameters of his habitat, and 
motivation is basically the maximization function. We 
have shown that animals behave in such a fashion as
to maximize their ration and to reduce the effects of 
constraints placed on their normal feeding patterns. 
We assume that modifications of feeding patterns 
which increase the probability of finding suitable food 
in a given habitat will also occur, thus improving the
niche. Consider, for example, the problem of a sit-and- 
wait predator whose food consists of two sizes of prey. 
When large prey predominates, the optimal strategy 
is to concentrate on the large prey; but as the ratio 
of large to small prey decreases, the frequency of at- 
tack on small prey should increase (Schoener, 1971). 
A second strategy, different from optimization, also is 
possible. The predator can move to a part of his environment
in which the ratio of large to small prey
is more favorable. Thus the animal can either maxi- 
mize within a given habitat or he can improve the 
habitat by modifying its parameters. It is the latter
skill which is so characteristic of the human animal. 
It is our conjecture that the essence of reinforcement 
lies in “controlling” the environment, where control is 
evaluated in terms of its effect on the total economy 
of the animal. 

Although the present theoretical speculations are
crude, we think that a fruitful research strategy is one
which attempts to study variables derived from ecological
descriptions in a controlled laboratory situation
using operant analogues. This approach may
yield a more generalized understanding of the mechanisms
underlying learning and motivation than otherwise
has previously been possible.


Pavlovian Control of Operant Behavior


An analysis of autoshaping and its implications for operant conditioning


INTRODUCTION 


In the beginning, there was the reflex. Pavlov em- 
ployed the reflex arc as a model in establishing the 
laws of classical conditioning. Under a variety of 
conditions, stimuli which had previously had no rela- 
tion to particular reflexes could be made to trigger or 
elicit them. This process of classical (or respondent or 
Pavlovian) conditioning was taken by Pavlov as the 
basic constituent of all learning, of all adaptive modification
of behavior.

From the outset, however, it was clear that Pavlov’s
phenomenon could not explain all he hoped. The
problem lay in the reflex arc model itself. Some, indeed
most, behaviors in which complex organisms engaged
did not appear to be elicited at all. While it was
easy to specify the stimulus which produced salivation
or flexion of a hind limb, it was difficult indeed to
find the stimulus which triggered walking, or writing,
or playing the piano. These behaviors appeared voluntary
and decidedly unelicited. They could not be
captured by the reflex arc concept or by the principles
of Pavlovian conditioning. A new learning principle
was required to explain their development and continued
occurrence. This principle was Thorndike’s
law of effect.

contribution to our thinking over the years has been profound
and, we hope, obvious in these pages. Preparation of the chapter
was facilitated by NSF grant BMS 73-01403 to the first author.

The rest, of course, is history. It remained for Skin- 
ner to highlight the distinction between the two kinds 
of learning, to emphasize the particular importance of 
the law of effect in learning—operant or instrumental 
conditioning—and to develop a brilliant set of 
methods for the study of operant conditioning. The 
research which is the fruit of Skinner’s pioneering 
work is prodigious. The study of animal learning currently,
at least in the United States, is heavily focused
on operant conditioning. This volume and its prede- 
cessor (Honig, 1966) are testimony to the rather rapid 
development of sophisticated analyses of the principles 
which govern operant behavior. 

Meanwhile, the study of Pavlovian conditioning
has also progressed, although more slowly. Most of 
Pavlov’s initial findings are reproducible, and the 
domain over which Pavlovian conditioning extends 
has been enlarged. In addition, there have been occa- 
sional new ideas about Pavlovian conditioning which 
have changed our understanding of its basic nature 
(e.g., Kamin, 1969; Rescorla, 1967). 

In the midst of this atmosphere of progress, there is
one problem which has consistently resisted solution:
the problem of the relation between Pavlovian and
operant conditioning. How are the two types of learning
to be defined? What are the processes which
underlie them? Are they mutually exclusive, or can
they operate simultaneously on the same class of behavior?
Are there any empirical findings which unequivocally
allow one to distinguish between them?
These questions do not have secure answers. [...]
The two types of learning have been specified
in an effort to highlight their differences (Kimble,
1961). What emerged from this analysis, however, was
their remarkable similarity. More recent efforts to distinguish
between the two types of learning (Rescorla
& Solomon, 1967) have been similarly frustrated. Defining
the two types of learning in terms of the procedures
used to produce them is the best our current
understanding allows.

Consider an example. The prototypic experimental
context for the study of operant conditioning currently
is the study of the pigeon pecking at a response
key. The key peck is presumably a voluntary behavior,
governed by the law of effect. Yet in 1968 Brown and
Jenkins showed that Pavlovian conditioning procedures
could also generate and sustain pecking.

Does this mean that pecking is both voluntary and
reflexive? Does it mean that pecking is sometimes
voluntary and sometimes reflexive? The best one can
do at present is suggest that pecking which is produced
by a Pavlovian procedure is reflexive while
pecking which is produced by an operant procedure
is voluntary. The experimental procedure defines the
learning process. Unfortunately, this is demonstrably
false. Pecking which is produced by a Pavlovian procedure
is also governed by the law of effect. It is just
these interactions between Pavlovian and operant
conditioning variables in the control of key pecking
in the pigeon which are the concern of this chapter.
Rather than endeavor to classify instances of learning 
on the basis of the two procedures, we shall try to 
assess the joint influence of each on the occurrence of 
a single class of behavior. We hope to demonstrate 
that Pavlovian and operant principles represent good
analytic tools though they are problematic as classi- 
ficatory categories. 

In the first part of the chapter we shall review the 
evidence that Pavlovian procedures are sufficient to
produce and maintain pecking in the pigeon. Most of 
this evidence centers on the study of autoshaping and 
automaintenance. In the second part of the chapter 
we shall show that the Pavlovian control of key peck- 
ing revealed by autoshaping studies enters signif- 
icantly into many standard operant conditioning pro- 
cedures. The bulk of this section will be a discussion 
of multiple schedules of reinforcement. It will be a 
kind of case study, designed to illustrate the pervasive 
influence of Pavlovian conditioning on the control of
operant behavior. The chapter will not provide an
exhaustive discussion of either autoshaping or multiple-schedule
performance. Very thorough recent reviews
have been provided by Hearst and Jenkins
(1974) of the former and by Mackintosh (1974) of the
latter.


AUTOSHAPING AND AUTOMAINTENANCE 


Autoshaping: Necessary and 
Sufficient Conditions 


In 1968 Brown and Jenkins reported the following 
experiment. Deprived, magazine-trained, but otherwise
naive pigeons were placed in a dimly illuminated
chamber. Once every 60 sec, on the average, a response 
key was illuminated for 8 sec and followed by the 
delivery of grain. The surprising result of this pro- 
cedure was that even though food delivery was inde- 
pendent of the pigeons’ behavior, all 36 subjects 
began pecking at the illuminated key after between 
6 and 119 key-food pairings. Once pecking occurred,
each peck at the illuminated key extinguished the key
light and produced immediate food delivery. Brown
and Jenkins called this procedure autoshaping (because
it was automatic; the pigeon shaped itself). As
such, it represented an important technical advance
over the previously used “method of successive approximations”
(e.g., Ferster & Skinner, 1957), which
was as inexact as it was artful.

However, autoshaping has represented considerably
more than technological improvement. The
phenomenon has raised theoretical issues which go to 
the heart of the experimental analysis of behavior. 
The reason is that key pecking in the pigeon has been 
considered a prototypic operant—an arbitrarily de- 
fined class of skeletal behavior which is sensitive to 
and controlled by its consequences. Indeed, most of
our present understanding of the control of behavior 
by its consequences has come from the study of key 
pecking. Yet the autoshaping paradigm is Pavlovian: 
An arbitrary conditional stimulus or CS (key light) 
precedes an unconditional stimulus or US (grain). As 
in other Pavlovian procedures, the sequence of events 
is completely unaffected by the organism’s behavior, a 
characteristic which does not fit well with our defini- 
tion of operants. Moreover, Pavlovian procedures are 
usually said to influence reflexive, nonskeletal, nonoperant
activities. Thus autoshaping seems to represent
Pavlovian conditioning of a prototypic operant
response (Gamzu & Williams, 1971; Hearst & Jenkins, 
1974; Jenkins, 1973; Jenkins & Moore, 1973; Moore, 
1973). How can the same response be both operant 
and reflexive? How secure is the distinction between 
operant and Pavlovian conditioning? Is susceptibility 
to Pavlovian influence unique to key pecking, or do 
all operants share this property? Are there other 
phenomena which demonstrate Pavlovian control of 
key pecking which have been overlooked or misinter- 
preted in the past? These and other questions will be 
addressed in the sections which follow. The first order 
of business, however, is an assessment of whether auto- 
shaping is unequivocally the product of a Pavlovian, 
stimulus-reinforcer association. In this section we shall 
consider the conditions which are necessary and sufficient
for the acquisition of key pecking in autoshaping
procedures.

Just as in standard studies of Pavlovian condition- 
ing, a variety of control procedures are required to 
ascertain whether an association of CS (key light) and 
US (food) is what produces autoshaping. Many of 
these control procedures were included in Brown and 
Jenkins’s demonstration of autoshaping. The results 
and procedures are shown in Figure 1. Brown and 
Jenkins reported only the trial on which the first key 
peck occurred. The reason for this is that in all but
the last of their procedures, a key peck immediately
produced food. Thus once the first peck occurred, the
response-reinforcer relation would exert control over 
pecking and increase its likelihood. In most later 
studies of autoshaping, food delivery has always been 
independent of responding. This makes the study of 
pecks after the first one of interest. The first panel 
shows the basic procedure already described. The 
second panel shows that when key and food were 
paired in reverse order (backward conditioning) only 
2 of 12 pigeons pecked the key. When the key was 
illuminated without food (CS only—third panel), no 
pigeons pecked the key. When the key was continually
illuminated and food was presented periodically (US
only), 4 of 12 pigeons pecked the key (fourth panel).
When the key light was illuminated for 3 rather than
8 sec, 21 of 22 pigeons pecked the key (fifth panel).
When a dark key was paired with food, only 2 of 6
pigeons pecked the key, while when a red key (which
was otherwise white) was paired with food, all 6
pigeons pecked the key (sixth and seventh panels).
These control conditions appear to demonstrate that
pairing is necessary and sufficient (except when a dark
key is the CS) to produce autoshaping.

Fig. 1. Schematic representation of the procedures used by
Brown and Jenkins and summary acquisition results. The top
procedure is the standard autoshaping paradigm. (From Brown
& Jenkins, 1968. © 1968 by the Society for the Experimental
Analysis of Behavior, Inc.)

However, in the province of standard Pavlovian 
conditioning, Rescorla (1967) has persuasively argued 
that none of the control procedures employed by 
Brown and Jenkins are sufficient to allow an un- 
equivocal conclusion that autoshaping is the result of 
a CS-US relation. What is needed is a procedure in
which CS and US are neither paired nor unpaired, but 
independent of one another. This, of course, is the 
truly random control procedure. In such a procedure, 
pairings of CS and US occasionally occur, but the 
occurrence of the CS provides no information about 
the likelihood of occurrence of the US; i.e., the probability
of the US given the CS equals the probability
of the US given no CS [$P(\mathrm{US} \mid \mathrm{CS}) = P(\mathrm{US} \mid \overline{\mathrm{CS}})$].
Peterson (1972), Bilbrey and Winokur (1973), and Wasserman,
Franklin, and Hearst (1974) have employed this
procedure and have shown that it does not result in 
autoshaping. The procedure has been studied most 
extensively, however, by Gamzu and Williams (1971, 
1973), and it is to their work we now turn in discuss- 
ing the importance of the informativeness of the CS 
for autoshaping. 
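
The defining property of the truly random control, equality of the two conditional probabilities, can be checked directly from a session record. The following is a hedged sketch rather than a procedure from any of the cited studies: the function name `us_probabilities`, the representation of a session as two 0/1 lists with one entry per time bin, and the example data are all assumptions made for illustration.

```python
def us_probabilities(cs, us):
    """Estimate P(US | CS) and P(US | no CS) from two 0/1 records.

    `cs` and `us` hold one entry per time bin: 1 if the stimulus was
    present (cs) or food was delivered (us) in that bin, else 0.
    """
    if len(cs) != len(us):
        raise ValueError("records must have the same length")
    cs_bins = sum(1 for c in cs if c)
    no_cs_bins = len(cs) - cs_bins
    if cs_bins == 0 or no_cs_bins == 0:
        raise ValueError("need bins both with and without the CS")
    us_with_cs = sum(1 for c, u in zip(cs, us) if c and u)
    us_without_cs = sum(1 for c, u in zip(cs, us) if not c and u)
    return us_with_cs / cs_bins, us_without_cs / no_cs_bins


# Hypothetical 12-bin session: CS on for 4 bins, off for 8.
cs = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
us = [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0]
print(us_probabilities(cs, us))  # (0.25, 0.25)
```

In this made-up record one CS-US pairing does occur, yet the CS carries no information about food because the US is equally likely in its presence and absence.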


Informativeness of the CS 


CONTINGENCY VS. PAIRING


Gamzu and Williams (1971, 1973) studied a variant 
of the autoshaping procedure similar to that used in 
recent studies of Pavlovian aversive conditioning 
(Rescorla, 1968). A response key was periodically 
illuminated for 8.6-sec trials, with a variable intertrial 
interval (ITI) with a mean of 30 sec. Once every 
second during the trial a random probability generator 
was sampled, and an output occurred with a probability
of .03 (p = .03). Each output operated the feeder for
4 sec. Thus food was delivered once every 33 sec or once
every fourth trial, on the average. This differential
procedure is depicted in the second panel of Figure 2.
It differs from standard autoshaping in two respects:
first, food is not delivered in every trial; second, food
can be delivered at any moment during the trial, not
just at the end. Key pecking was acquired and maintained
at high rates in all pigeons tested. The procedure
was then modified so that the probability generator
was sampled at the same
rate during intertrial intervals as during trials; the key
was nondifferentially related to food, i.e., food was as
likely in its absence as in its presence. This procedure
is equivalent to the truly random control procedure
(Rescorla, 1967) and is depicted in the third panel of
Figure 2. Pigeons exposed to this procedure did not
peck the key. The two other control procedures depicted
in Figure 2 were also run. In one (differential-absence)
food was presented only in the ITI, and in
the other (no-reinforcement) food was never pre- 
sented. Neither procedure generated consistent key 
pecking. The birds on these three procedures were 
then exposed to the differential procedure, and peck- 
ing was obtained in all of them. Thus key pecking was 
reliably obtained only when the illuminated key was 
differentially associated with food. These data led
Gamzu and Williams to conclude that the acquisition
(and indeed, as we shall see later, the maintenance) of
autoshaped key pecking depended on the information
the CS conveyed about the US, rather than a mere
pairing of the two. These conclusions parallel those
drawn by Rescorla (1972) about the crucial relationships
in Pavlovian conditioning.

Fig. 2. Schematic representation of (a) the basic autoshaping
paradigm and (b-e) procedures in which food presentation was
determined by sampling a probability generator (PG): (b) The
differential procedure, in which food was presented randomly
in time, but only during illuminated key trials. (c) The nondifferential
procedure, in which food was presented randomly
in time during trial and intertrial intervals. (d) The differential-absence
procedure, in which food was presented only in the
intertrial interval. (e) The nonreinforcement procedure, in
which food was never presented. (From Gamzu & Williams,
1973. © 1973 by the Society for the Experimental Analysis of
Behavior, Inc.)
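
The figures quoted for the differential procedure (food about once every 33 sec of trial time, or roughly every fourth trial) follow from the sampling parameters. The sketch below simply reproduces that arithmetic; the function names are hypothetical, and it assumes, as the text's approximation does, that the generator is sampled once per second and that the mean wait is the reciprocal of the per-sample probability.

```python
P_FOOD_PER_SAMPLE = 0.03   # probability of an output on each sample
SAMPLE_PERIOD_SEC = 1.0    # generator sampled once every second
TRIAL_SEC = 8.6            # duration of each illuminated-key trial


def mean_seconds_between_food(p, sample_period_sec=1.0):
    """Mean sampled time between food deliveries (reciprocal of p per sample)."""
    return sample_period_sec / p


def mean_trials_between_food(p, trial_sec, sample_period_sec=1.0):
    """Mean number of trials per food delivery when sampling occurs only in trials."""
    return mean_seconds_between_food(p, sample_period_sec) / trial_sec


seconds = mean_seconds_between_food(P_FOOD_PER_SAMPLE, SAMPLE_PERIOD_SEC)
trials = mean_trials_between_food(P_FOOD_PER_SAMPLE, TRIAL_SEC, SAMPLE_PERIOD_SEC)
print(f"about {seconds:.1f} sec of trial time per food delivery")
print(f"about {trials:.1f} trials per food delivery")
```

With these parameters the calculation gives about 33.3 sec and about 3.9 trials, in line with the rounded values reported in the text.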


REDUNDANCY


Egger and Miller (1962, 1963) suggested that the informativeness
of a stimulus with respect to reinforcement
determines its power as a secondary reinforcer.
They pointed out that just because a stimulus always
precedes reinforcement does not guarantee that it is
informative. For example, if a series of pairings al- 
ways consists only of tone followed by food, then the 
tone is informative. If, however, the tone is always 
preceded by a light, then the tone is now redundant 
because the light reliably predicts reinforcement, and 
that “information” is available before the tone is 
sounded. However, if the light is sometimes followed 
by tone and food and sometimes not, then the mere 
presentation of the light does not “guarantee” rein- 
forcement, while the tone does. Thus, the tone is once 
again informative. Egger and Miller substantiated this 
argument by showing that the relative efficacy of two 
stimuli as secondary reinforcers depended on whether
they were informative or redundant in the sense out- 
lined. 

Allaway (1971) tested the Egger and Miller conceptualization
of informativeness in autoshaping. Three
groups of pigeons were exposed to a basic fixed-trial 
autoshaping procedure in which 6-sec key illumina- 
tions were always followed by access to food. For one 
group (key-only) no additional manipulations were 
involved. For the second group, 2 sec before each 
illuminated key trial a tone was sounded for 8 sec 
after which food was delivered. For this group the 
illuminated key was redundant (key-redundant). The
third group had the same conditions as the key-redundant
group, but in addition an equal number of 8-sec
tone trials that were associated with neither key nor
food (tone-irrelevant). Throughout the experiment,
reinforcement was delivered response-independently. 
In general, key pecking was far less frequent when the 
key was redundant than in either of the other two 
procedures. Indeed, some naive birds failed to auto- 
shape when the lit key was always preceded by a tone. 
Allaway’s data confirm the fact that the important 
feature of the CS-US relationship in autoshaping is
the informativeness of CS with respect to the US and
not merely the pairing of the two. Wasserman and his
co-workers have obtained data consistent with Allaway’s
with a variety of similar procedures (Wasserman,
1972, 1973b, 1974; Wasserman & Anderson,
1974; Wasserman & McCracken, 1974). 


TRIAL AND ITI DURATION EFFECTS 


Another aspect of informativeness which is not captured
by measures of conditional probability is the
importance of relative trial and intertrial interval
(ITI) durations. To understand the informativeness
of these events, consider the following analogy. Imagine
waiting at a train station, through which five
differently numbered trains continually pass. Only one 
number will take you to your destination. If the trains 
arrive once every minute, you can afford to read your 
copy of this volume and only look up occasionally. 
However, if the trains come in once every 10 minutes, 
you will certainly pay more attention to the train 
arrivals (trials). Consider now the difference between 
waiting at the terminal and waiting at a through sta- 
tion. At the terminal the trains roll in and wait for, 
say, a minute, so that once you notice that a train is 
in, you can go over and inspect the number. How- 
ever, at the through station the train only stops for 
20 seconds, so that it is necessary to be constantly 
alert in order to determine whether the incoming 
train is the one you want to board. As the waiting
time (trial duration) gets shorter until the train stops 
only on demand, the vigilance required increases. The 
relevance of this analogy to autoshaping is as follows: 
we would expect that with ITI held constant, shorter 
trials will convey greater information, and will thus 
engender more pecking. Similarly, with trial duration 
constant, longer ITIs will generate more pecking dur- 
ing a trial. 

In their very first experiments, Brown and Jenkins 
(1968) compared an 8-sec trial with a 3-sec trial but 
did not find much difference in rate of acquisition. 
Ricci (1973), however, reported that with a constant 
mean ITI of 4 min, autoshaping was much more
rapid when the trial duration was 30 sec than when it 
was 120 sec. 

An extensive study of trial duration has been made 
by Baldock (1974). When ITI duration was held con- 
stant, the rate of acquisition of autoshaping was in- 
versely related to trial duration; that is, autoshaping 
was most rapid when the CS was brief (4 sec) and was 
gradually less rapid as the trial duration increased 
(up to 32 sec). 

Terrace, Gibbon, Farrell, and Baldock (1975) investigated
the other side of this coin. They used a
constant 10-sec trial and varied the mean ITI from
5 to 400 sec. The rate of acquisition was a direct
monotonic function of the ITI. Some of the data are
presented in Figure 3. Thus both intuitions about the
role of trial and ITI durations are confirmed. The
next logical step was to combine them. Baldock
(1974) showed that the important dimension in determining
the rate of acquisition of pecking is the ratio
of trial to ITI time. Over a wide range of trials (4 to
32 sec) and mean ITIs (8 to 78 sec), acquisition was
always more rapid when the ratio of trial to ITI was
smaller.

Fig. 3. Median number of reinforcements prior to first trial
on which a peck occurred as a function of the mean intertrial
interval. Trial durations were always 10 sec. (From Terrace et al.,
1975.)
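
The ratio account described above can be made concrete with a short sketch. The Python below is illustrative only and is not taken from the studies cited: the function names are hypothetical, the four conditions simply combine the endpoints of the trial and ITI ranges mentioned in the text, and the ordering printed is the qualitative prediction (smaller trial-to-ITI ratio, faster acquisition), not experimental data.

```python
def trial_to_iti_ratio(trial_sec, mean_iti_sec):
    """Ratio of trial (CS) duration to mean intertrial interval."""
    if trial_sec <= 0 or mean_iti_sec <= 0:
        raise ValueError("durations must be positive")
    return trial_sec / mean_iti_sec


def rank_by_predicted_speed(conditions):
    """Order (trial_sec, mean_iti_sec) pairs from fastest to slowest
    predicted acquisition, i.e. from smallest to largest ratio."""
    return sorted(conditions, key=lambda c: trial_to_iti_ratio(*c))


# Hypothetical conditions built from the range endpoints in the text.
conditions = [(32, 8), (4, 78), (4, 8), (32, 78)]

for trial, iti in rank_by_predicted_speed(conditions):
    ratio = trial_to_iti_ratio(trial, iti)
    print(f"trial={trial:>2} s  ITI={iti:>2} s  ratio={ratio:.2f}")
```

Under this assumption a long trial with a long ITI (32 sec and 78 sec) is predicted to support faster acquisition than a short trial with a short ITI (4 sec and 8 sec), because only the ratio matters in the sketch.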

We can summarize briefly what is presently known 
about the necessary CS-US relationships for autoshaping
to occur: specific pairings are neither necessary
nor sufficient. Rather, the CS must provide
information (broadly construed) about the occurrence
of the US. The more informative the CS is, the more
rapid acquisition is.¹


Types of US and the Relation
Between the US and Response


The apparent similarity between the conditions
necessary to produce autoshaping and the necessary
conditions for more standard demonstrations of Pavlovian
conditioning leads one to look for further
evidence that autoshaping is an instance of Pavlovian
conditioning. One obvious thing to investigate is the
relation between the US or reinforcer and the conditioned
response. Pavlovian conditioned responses
typically are some component of the unconditioned
response to the US. In this section we review evidence
that the same is true of autoshaping.

In the most common autoshaping situation, with
pigeons as subjects and food as the US, the conditioned
response is key pecking. Pecking is of course
the major component of the pigeon’s feeding repertoire
(e.g., Craig, 1917). Other species also show this
similarity between the response elicited by the reinforcer
and the autoshaped response. For example,
bobwhite quail peck to feed and peck a light that
signals food (Gardner, 1969). Squier (1969), in a discussion
of autoshaping in different species of fish, stated
that “the topography of key responses varied, each
type closely resembling the consummatory response”
(p. 178). Smith and Smith (1971) demonstrated
autoshaping in the dog, and S. G. Smith (personal
communication) remarked that licking the response
key was observed. Rats lick and gnaw at laboratory
food, and indeed Peterson, Ackil, Frommer, and
Hearst (1972) reported that rats react in the same way
to a response lever when it is the signal for food in an
autoshaping paradigm. Stiers and Silberberg (1974)
have made similar observations. The two exceptions
to this general trend were reported by Sidman and
Fletcher (1968), who autoshaped rhesus monkeys for
food, and Gamzu and Schwam (1974), who studied
squirrel monkeys. Sidman and Fletcher noted that “although
the monkey used its fingers both to press the
key and to pick up the pellet, the topography of these
two behaviors is quite different” (p. 308). Gamzu and
Schwam found similar results and reported that some
monkeys eventually made nose-pressing responses.
They suggested that these response-topography differences
resulted from the fact that the consummatory
behavior of the monkey is quite variable in comparison
to that of the pigeon. It is possible that in species
with varied feeding behaviors, the particular behavior
one observes will be governed primarily by the manipulandum.
For example, Moore (1973) reported that
monkeys grasped and bit a protruding key “as if it
were food” (p. 187).


¹ Throughout this discussion we have finessed the problem
that the concept “informativeness” cannot at present be defined
to include all the senses in which we have used it here. There
have been a few attempts to provide such a definition (Bloomfield,
1972; Gibbon, Berryman, & Thompson, 1974; Rescorla,
1972), but none have been complete. Most recently, Gibbon et al.
(1974) have proposed a metric for evaluating contingencies in
classical and instrumental conditioning. Their model can incorporate
the conditional probability studies and trial and ITI
duration studies described above. It cannot account for the
redundancy effects. It is, however, the most thorough and
complete account to date.


FOOD VERSUS WATER


Jenkins and Moore (1973), Moore (1975), Morri- 
son (1974), and Woodruff (1974) have all autoshaped 
pigeons using water rather than food as the reinforcer. 
All found that the autoshaped response resembled 
drinking movements. Perhaps the most elegant demon- 
stration of the relationship between US and autoshaped 
response can be found in Jenkins and Moore’s (1973) 
study. First they made high-speed films and videotapes 
of the unconditional behaviors to food and water. Us- 
ing these as prototypes, judges were then asked to eval- 
uate the response form of birds autoshaped for food or 
water. The judges who were presented with videotapes 
of the learned response only (but not with the whole 
sequence, which would have allowed them to see the 
reinforcer itself) correctly identified the approach and 
contact movements as grain-related or water-related 
on 87% of the trials. The response was either like
grain pecking or like drinking. Sample photographs of 
the two types of key pecking are presented in Figure 4.

Fig. 4. Photographs of pigeons pecking at response keys. Pictures
(a) and (b) show the consummatory topographies obtained when 
key responses are autoshaped with water (a) and food (b) rein- 
forcement; (c) and (d) show a food-reinforced instrumental 
response. The spot of light is a discriminative stimulus. Part (e) 
shows the consummatory reaction which often arises even when 
lever pressing rather than key pecking is reinforced. When 
presses are reinforced only in the presence of some positive cue, 
the cue itself may elicit consummatory reactions, as shown in (f). 
(From Moore, 1973.) (photo by Roy DeCarava) 


In their second and third experiments, Jenkins and 
Moore showed that the determinant of the response 
form was the reinforcer itself and not the depriva- 
tional state. As long as the key predicted food and not 
water the response form was foodlike, even if the 
dominant deprivational state was changed to thirst. 
Finally, in a most convincing experiment, Jenkins and 
Moore exposed food- and water-deprived pigeons to a 
procedure in which two different colored key illu- 
minations were used, one to signal food and the other 
to signal water. In most pigeons the response to the 
stimulus predicting water reinforcement was a drinkinglike
movement, while the response to the signals
for food was appropriately like a grain peck. Because
the stimuli were presented randomly and equally 
often and because both deprivation states were in- 
duced, these data represent the clearest evidence for 
the dependence of the autoshaped response on the 
actual consummatory response. 


OTHER USs 


Farris (1967) conditioned the courting behavior of 
three male Japanese quail. On four separate occasions 
each day a buzzer was sounded for 10 sec at the end of 
which a female quail was introduced to the cage and 
was left there until copulation occurred or 1 min had 
passed. The CS overlapped the presence of the female
by 5 sec. Within as few as 5 pairings, part of the male 
display began to occur in the presence of the buzzer. 
After 32 pairings all components of the characteristic 
male display were reliably elicited by the CS in all 
three birds. In a similar experiment, Rackham (1971, 
cited in Moore, 1973) exposed pigeons to repeated 
pairings of a stimulus light and a sexual reinforcer. 
He, too, found that the behavior to the signal (and in 
this case it was directed to the visual signal) strongly 
resembled the unconditioned response that would be 
elicited by the forthcoming reinforcement. 

Similar findings have been reported for Pavlovian 
conditioning of aggressive display in Siamese fighting 
fish (Adler & Hogan, 1963; Murray, 1974; Thompson 
& Sturm, 1965). Thompson and Sturm, for example, 
paired a red stimulus light with a mirror presentation 
and found gradual acquisition of conditioned be- 
havior that was identical to the unconditioned aggres- 
sive display. A study by Rachlin (1969) could also be 
interpreted as an instance of autoshaped aggressive 
behavior. Rachlin exposed pigeons to an autoshaping- 
like procedure with shock as the US. A response on 
the key, which was fitted with a transparent hemi- 
spheric extension, turned off the shock. Rachlin found 
that key responding could be conditioned, but that 
some birds pecked the key while others struck it with 
their wings. Moore (1973) has suggested that since 
wing flapping and pecking are both parts of the 
pigeon’s normal aggressive behavior pattern, and since 
shock elicits aggression, Rachlin’s data are another in- 
stance of reinforcer-appropriate conditioned behavior. 

Wasserman (1973a) recently reported that 3-day-old 
chicks would peck at a key, the illumination of which 
always preceded the illumination of an overhead heat 
lamp. In order to insure the reinforcing quality of
heat, the experiments were conducted in a cold chamber
(5-15°C). Appropriate control groups were run,
and the subjects in those groups seldom pecked the
key. In the group exposed to the pairings of key
illumination and heat, seven of eight chicks pecked
the CS within the first 20 trials, although on some
trials they contacted the key with a “snuggling” response.
The unconditioned response to the heat lamp
was described as a reduction of activity accompanied
by an extension of the wings and the emission of
twittering sounds. Other aspects of the response were
less uniform. Both Hearst and Jenkins (1974) and
Wasserman (1973a) regard this experiment as showing
autoshaping of a response that is different in topography
from the response to the US. Hogan (1974) has
pointed out that pecking and snuggling are part of
the normal heat-seeking repertoire of chicks, however.

Peterson, Ackil, Frommer, and Hearst (1972) implanted
electrodes in the lateral hypothalamus of rats.
After it was determined that the sites were positively
reinforcing, the rats were exposed to autoshaping.
One illuminated retractable lever (CS+) was inserted
for 15-sec periods after which a train of rewarding
stimulation was presented. A second lever (CS-) was
presented equally often but was never paired with
reinforcement. Each rat soon began to approach the
CS+ and “sniff” it, making contact with the lever
with its whiskers. The CS- was generally ignored.
Peterson et al. reported that the exploratory behavior
in the vicinity of the signal for brain stimulation was
quite constant for a given rat, but varied among rats.
However, “there seemed to be a definite relation between
the behaviors directed at the CS+ and those
elicited by the brain stimulation: if an animal sniffed
or displayed certain postural adjustments during US
presentation, we often noticed fragments of the same
general pattern during presentation of the CS+” (p.
1011). In experiments with a design similar to the
Jenkins and Moore (1973) experiment, Peterson
(1972) confirmed and extended these findings. Rats
that were both hungry and had electrode implants
showed directed responses to a CS+ lever that were
classified either as licking-gnawing or sniffing-exploring.
These were perfectly correlated with the type of
reinforcer and corresponded to the topography of the
response to the US.

Finally, Woodruff (1974) employed an ingenious 
technique to study the importance of localizability of 
the US. A small hole was made in the upper mandible 
of pigeons and a chronic cannula implanted. Through
this cannula small amounts of water were delivered as 
the USs in an autoshaping procedure. When water 
was placed directly in the beak, the pigeons drank it, 
usually without any peckinglike behavior. Nonetheless,
key pecking was autoshaped, and, not surprisingly,
the topography of the peck was “drinking”-like
and altogether indistinguishable from key pecks autoshaped
with water presented in the standard fashion.
Thus it appears that neither the sight of the US nor 
the occurrence of US-appropriate consummatory be- 
havior is a necessary feature of autoshaping with 
water. 


SUMMARY 


A wide variety of reinforcers in a number of species 
have been studied in autoshaping experiments. The 
overwhelming impression derived from these data is 
that the autoshaped response usually bears a remarkable
resemblance to the response elicited by the reinforcer.
There are two types of exceptions to this statement.
The first are the counterexamples, particularly
in the primates (but also Wasserman’s experiment 
with heat as the US in chicks). Moore (1973) feels that 
the use of an appropriate stimulus/manipulandum
would convert the primate work into positive examples 
of response similarity. Gamzu and Schwam (1974) 
and Schwam and Gamzu (1975), on the other hand, 
have argued that in primates one ought to expect dis- 
similarities between conditioned and unconditioned 
responses, because the latter are so variable. Indeed, 
it is difficult in squirrel monkeys to specify what
skeletal behavior will be elicited by the presentation 
of food. The second type of exception is less severe.


directly in the beak) are not localizable in the cham- 
ber. Yet in both cases the autoshaped behavior is 
clearly directed toward the signal. It should be
pointed out that it is only the directed aspect of
these behaviors that is problematic. In both cases the
autoshaped response is clearly similar to the uncon- 
ditioned behavior. 


Discussion and Conclusion:


Autoshaping as Pavlovian Conditioning 


Brown and Jenkins (1968) in their discussion of 
the autoshaping phenomenon posed the possibility 
that key pecking might have emerged as a result of 
Pavlovian conditioning, although it seemed unlikely 
to them at that time. Since then, autoshaping has 
been interpreted as an instance of Pavlovian conditioning
with varying degrees of reservation (cf.
Gamzu, 1971; Peterson et al., 1972). Perhaps the most
wholehearted adoption of the Pavlovian conditioning
account of autoshaping can be found in Moore
(1973), who brings together a great deal of evidence in
support of this approach. The data that have been
presented here overwhelmingly suggest that autoshaping
is Pavlovian. First of all, in all cases of autoshaping
that have been extensively studied the crucial
variable has always been the signaling relationship 
between the CS and US. Autoshaping occurs if and 
only if the CS reliably predicts a period of relatively 
higher density of reinforcement than otherwise ob- 
tains. Control procedures such as the truly random 
control, CS-alone, US-alone, and backward-pairing all 
fail to generate the autoshaped behavior. Indeed, 
when a CS- is paired with the absence of food,
pigeons tend to move away from it (Jenkins & Boakes,
1973; Wasserman et al., 1974) and one can independently
demonstrate that such CS-s have the inhibitory
properties (Wessells, 1973) that are predicted
from Pavlovian theory. Secondly, the responses that 
are autoshaped are far from being arbitrary. On the 
contrary, they tend to be very constrained within a 
given species and are demonstrably similar to the re- 
sponses that are unconditionally elicited by the par- 
ticular reinforcer being used. 

Given the quality and quantity of the data, it is 
surprising that there is still considerable reluctance to 
accept the notion that autoshaping is Pavlovian (cf. 
Herrnstein & Loveland, 1972; Hursh, Navarick & Fan- 
tino, 1974). Other than dogma, there appear to be 
three reasons for this reluctance: the directedness of 
the response, the absence of a deleterious effect of 
partial reinforcement, and a continuing dispute about 
the process presumed to underlie Pavlovian condi- 
tioning. 

Among the many aspects of autoshaping that make 
it an interesting phenomenon is the directedness of 
the response. Indeed, this may be the only way of 
distinguishing autoshaping phenomena from more 
familiar instances of Pavlovian conditioning. Traditionally,
salivation, galvanic skin response (GSR),
heart rate, and eye blinking have been the behaviors 
studied in Pavlovian conditioning experiments. None 
of these behaviors could be called directed. However, 
when other components of behavior are noted, 
directed behavior is often observed. For example, 
Pavlov noted that dogs licked an electric bulb that 
was a CS for food (Pavlov, 1955); indeed, if the stim- 
ulus was within reach, the dog usually tried to touch 
it with its mouth (Pavlov, 1941). Similar findings 
were reported by Zener (1937). Thus directedness of 
behavior has been observed in Pavlovian conditioning 
experiments but simply was not the focus of the re- 
search. As a result, it has been more or less ignored. 
Autoshaping redresses that wrong and provides a 
vehicle for the study of Pavlovian control of directed 
skeletal behavior. 

The reader will have already noted that partial 
reinforcement does not seem to have a deleterious
effect on autoshaping (e.g., Farrell, 1974; Gamzu &
Williams, 1971, 1973; Gonzales, 1973, 1974; Schwartz 
& Williams, 1972a), although it is claimed severely to 
retard acquisition of classically conditioned responses. 
Indeed, Spence (1966) suggested that this effect of 
partial reinforcement be used to distinguish between 
Pavlovian and operant conditioning. However, Gor- 
mezano and Moore (1969) summarized the literature 
as being equivocal. Grant and Schipper (1952) found 
conditioning to be unimpaired by partial reinforce- 
ment. More representative are findings that Pavlovian 
conditioning (of eye blink reflex on the whole) will 
occur at partial reinforcement (even as low as 25%), 
but that the magnitude of the effect is smaller than in 
100% control groups (e.g., Ross, 1959). Another effect 
of partial reinforcement is to increase resistance to 
extinction. This phenomenon is well documented in 
operant learning (cf. Lewis, 1960) but it is not always 
seen in Pavlovian conditioning in infrahuman sub- 
jects (e.g., Thomas and Wagner, 1964). However, the 
original report of this partial reinforcement effect 
(PRE) by Humphreys (1939) was based on classical 
conditioning. Recently Hilton (1969) has shown a 
clear-cut PRE using a conditioned emotional response 
(CER) paradigm which is most commonly interpreted 
as resulting from Pavlovian conditioning (cf. Rescorla 
& Solomon, 1967). Indeed, although Hilton provides 
no details of acquisition, he does indicate that all the 
groups (consistent and partial reinforcement) showed 
equally effective complete suppression to the CS. This 
very cursory review should be sufficient to indicate 
that the effects of partial reinforcement in Pavlovian 
conditioning are sufficiently equivocal that they cannot 
possibly be used to distinguish it from operant con- 
ditioning or to define a phenomenon as an instance of 
Pavlovian conditioning. Thus it seems illogical to 
refute a Pavlovian conditioning approach to auto- 
shaping on these grounds. 

Finally we come to the mechanism presumed to 
underlie Pavlovian conditioning. If there were one or 
more clear-cut mechanisms that were unequivocally 
acceptable as an explanation of Pavlovian condition- 
ing, then it would indeed be fair to ask that all the 
autoshaping data be encompassed by one or more of 
these mechanisms. Unfortunately, this is not the case,
and thus we are left with an important theoretical 
problem for behavioral psychology. 

Stimulus substitution is the most commonly cited 
mechanism and is often the only mechanism con- 
sidered (cf. Terrace, 1973). It refers to the view that 
the CS in a Pavlovian conditioning experiment comes 
to substitute for the US and generate responses identical
to those produced by the US. The lack of well-established
alternatives is surprising, since it is quite
clear that a literal interpretation of stimulus substitution
is inadequate to explain even the most unequivocal
examples of Pavlovian conditioning. Typically
one observes CRs which (a) omit portions of the
unconditioned response (UR) and (b) include com- 
ponents which are absent from the UR. Because of 
the documented similarity of autoshaped and con- 
summatory responses, stimulus substitution has been 
the suggested mechanism for autoshaping. The short- 
comings of explanations of autoshaping that are based 
on Pavlovian conditioning are often related to the 
weakness of the stimulus-substitution concept. It 
seems unreasonable to expect conceptualizations of
autoshaping to be more precise than the theory on
which they are based.

A more reasonable approach is to use autoshaping
phenomena as a tool for better understanding the
mechanisms of Pavlovian conditioning. To some extent
this has already happened. For one thing, alternatives
to stimulus substitution have begun to be articulated.
Hearst and Jenkins (1974) have pointed out
that Pavlov himself probably thought of the CS as a
surrogate rather than as a substitute for the US. Indeed,
Gamzu (1971) preferred the term stimulus surrogation
as capturing the essence of the idea without
the limitations of stimulus substitution. Bindra (1972)
considered the US an unconditional incentive stimulus
and the CS a conditional incentive stimulus.
This apparently eliminates some of the problems
posed by the stimulus-substitution concept (cf. Hearst
& Jenkins, 1974; Moore, 1973), but still leaves the CS
as surrogate for the US. Hearst and Jenkins (1974)
have coined the term object substitution as capturing
Bindra’s approach. The most recent attempt to specify
the mechanism arises from the work of Woodruff and
Williams on autoshaping with water injected directly
into the beak. Williams (1974) and Woodruff (1974)
have referred to this experiment as demonstrating that
the key in autoshaping is a “learned releaser.” Williams
points out that food on the tongue and not the
sight of food is the important feature of Pavlovian 
salivary conditioning. Likewise, grain in the mandi- 
bles and not the sight of grain is the US in autoshap- 
ing. Via Pavlovian conditioning the sight of grain 
(paired with grain in the mandibles as the US) re- 
leases pecks at grain. Similarly, by Pavlovian condi- 
tioning the response key (paired with the sight of 
grain) releases pecking. 

Most of these approaches to the mechanism of auto- 
shaping can be transposed to other approaches with 
appropriate assumptions. Which is the most acceptable
approach or whether they clearly differ from one
another are matters for further exploration. Nonethe- 
less, this debate clearly revolves around facts that indi- 
cate that Pavlovian conditioning—whatever its under- 
lying mechanisms may turn out to be—is the major 
learning process in the acquisition of autoshaping. 
The next section will indicate that this is also true
in the maintenance of autoshaped behavior, where
operant relations also play a powerful role, sometimes
in opposition to Pavlovian relations.


Automaintenance: Interaction of
Pavlovian and Operant Contingencies


The phenomenon of autoshaping seems like such a
straightforward example of Pavlovian conditioning
that one wonders why it has generated so much interest
and research activity. While it is true that key
pecking has not traditionally been viewed as a mem- 
ber of the class of behaviors which is susceptible to 
Pavlovian procedures, there has been virtually no 
systematic investigation of key peck acquisition in the 
past, and the discovery of a method other than shap- 
ing by successive approximation for instituting key 
pecking might well have been viewed as just another 
fact and a methodological convenience. In relating 
autoshaping to the operant conditioning literature, 
one might reasonably adopt this model: Pavlovian 
conditioning procedures may be used to produce the 
first key peck. This peck is followed by food, and the
law of effect then takes over. Thus autoshaping and
the control of behavior by its consequences reflect
independent processes. This, indeed, is not unlike the 
initial tentative explanation of autoshaping put forth 
by Brown and Jenkins (1968). However, such an ac- 
count is dramatically inadequate. 

The experiment which demonstrated that the processes
which underlie autoshaping extend beyond
the acquisition of key pecking, and which is probably
responsible for the enormous research interest in the
phenomenon, was conducted by Williams and Williams
(1969). Naive pigeons were exposed to trials in
which the brief illumination of a key light was fol- 
lowed by food. Until the first peck occurred, the pro- 
cedure was almost identical to the Brown and Jenkins 
procedure. The crucial difference was this: if the
pigeon pecked the key, the trial was terminated and 
food was omitted. What might one expect the results 
of such an experiment to be? The Pavlovian contin- 
gency would generate the initial key peck (auto- 
shaping). However, key pecks would not be followed 
by food. Indeed, they would prevent food delivery,
while any other behavior that the pigeon engaged in
would be followed by food. The expectation is clear:
the negative contingency between pecking and food 
would quickly eliminate pecking. The result, how- 
ever, was that key pecking was maintained at substan- 
tial frequency over many hundreds of trials in vir- 
tually all pigeons. As Williams and Williams (1969) 
noted, this phenomenon is virtually identical to what 
Sheffield (1965) observed in studying the effects of a 
similar procedure on salivation in dogs. Sheffield 
labeled the phenomenon omission training. Williams 
and Williams labeled their finding automaintenance. 
It has since been referred to in the literature as nega- 
tive automaintenance to highlight the negative re- 
sponse-reinforcer contingency. This, however, is a 
rather cumbersome term, and so we shall hereafter 
refer to it as omission training or the omission effect 
(cf. Hearst & Jenkins, 1974). 
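
The difference between the two contingencies can be written out as a small sketch. The Python below is an illustration under stated assumptions, not a description of the original apparatus: the function names are hypothetical, and the simulated bird pecks with a fixed probability on every trial, which real pigeons do not. It shows only how the omission rule ties food delivery to the absence of a peck, whereas on a response-independent schedule food follows every trial.

```python
import random


def food_delivered(pecked, procedure):
    """Return True if food follows the trial.

    'automaintenance': food follows every trial; pecks have no consequence.
    'omission': a key peck ends the trial and cancels food.
    """
    if procedure == "automaintenance":
        return True
    if procedure == "omission":
        return not pecked
    raise ValueError(f"unknown procedure: {procedure}")


def proportion_reinforced(p_peck, procedure, n_trials=1000, seed=0):
    """Proportion of trials ending in food for a bird that pecks with a
    fixed probability p_peck on each trial (a simplifying assumption)."""
    rng = random.Random(seed)
    fed = sum(
        food_delivered(rng.random() < p_peck, procedure)
        for _ in range(n_trials)
    )
    return fed / n_trials


for p in (0.0, 0.5, 1.0):
    print(
        p,
        proportion_reinforced(p, "automaintenance"),
        proportion_reinforced(p, "omission"),
    )
```

In this sketch every peck on the omission procedure costs the bird a reinforcement, which is why the law of effect would lead one to expect pecking to disappear.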

The Williams and Williams study made it clear 
that the Pavlovian pairing of key and food con- 
tributed to the maintenance of responding, and not 
just to its acquisition. It demonstrated a clear viola- 
tion of the law of effect. The phenomenon has since 
been demonstrated in numerous experiments (e.g.,
Herrnstein & Loveland, 1972; Schwartz, 1972, 1973b;
Schwartz & Williams, 1972a, 1972b), though occasional
investigations have failed to obtain it reliably (e.g.,
Hursh et al., 1974). Jenkins (see Hearst & Jenkins, 
1974) has also reported a variant of the effect. In a 
study described as the “long box” experiment, pigeons
were exposed to autoshaping trials in an unusually 
long chamber. Response keys, which signaled food, 
were located at the ends of the chamber, and the 
feeder was located in the center of the chamber. Key 
pecking was acquired and maintained under these 
conditions despite the fact that when the pigeons 
pecked the key, it took them so much time to move 
from the key to the feeder that food presentation usu- 
ally terminated before they arrived. Thus, in this pro- 
cedure, key pecking did not prevent food presenta- 
tion, but it effectively prevented food consumption. 

What is one to make of the omission effect? The
maintenance of key pecking in the face of a negative 
response-reinforcer contingency strongly supports the 
view that stimulus-reinforcer relations dominate re- 
sponse-reinforcer relations in controlling key pecking 
in autoshaping-type procedures, and even suggests 
that response-reinforcer relations might exert no con- 
trol at all. We shall evaluate these possibilities in the 
following sections, focusing first on the role of stimulus- 
reinforcer relations in automaintenance, and second 
on the role of response-reinforcer relations. 


STIMULUS-REINFORCER RELATIONS IN 
AUTOMAINTENANCE AND OMISSION 


Many of the studies already discussed in connec- 
tion with the Pavlovian control of acquisition of 
responding are also relevant to the question of 
maintenance. The simplest demonstration of automaintenance
is a study by Schwartz and Williams
(1972b). Pigeons were exposed to 6-sec trials which 
terminated with food. Responses had no programmed 
consequence. Key pecking was nevertheless main- 
tained at rates between 8 and 15 responses per trial 
over many sessions in all pigeons. While this simple
demonstration of automaintenance seems to suggest 
Pavlovian control of pecking, there is, of course, an 
alternative explanation. We can presume that the 
control over initial responding is Pavlovian. How- 
ever, once these responses occur, they are followed in 
time by food. There is thus an adventitious response- 
reinforcer relation which may contribute substantially 
to the maintenance of responding once initiated 
(Herrnstein, 1966; Skinner, 1948). A series of studies 
by Gamzu and Schwartz (1973), Gamzu and Williams 
(1971, 1973), and Schwartz (1973a) strongly suggest
that such an account is inadequate. Let us suppose
that responding during trials was being maintained 
by an adventitious response-reinforcer relation. What 
influence would the delivery of food during the inter- 
trial interval have on this putative relation? It seems 
clear that these extra reinforcements might be expected
to increase pecking, or perhaps not influence
it. Certainly, they would not be expected to decrease
pecking. On the other hand, from the Pavlovian point 
of view, food deliveries during the ITI would decrease 
the differential predictiveness of the trial stimulus 
and, as a result, decrease Pavlovian control over peck- 
ing. The Pavlovian view makes the paradoxical pre- 
diction that increasing the rate of food delivery will 
decrease responding. This indeed is what occurs. In
the Gamzu and Williams studies, in which trials were 
8.6 sec long and ITIs averaged 40 sec, food delivery 
during the ITI at the same rate as during the trial 
virtually eliminated responding. In the Gamzu and 
Schwartz (1973) study, which involved the regular al- 
ternation of two key colors for 27-sec periods, when 
food was presented in only one key color substantial 
responding was maintained to that color. When food 
was then presented in both colors with equal fre- 
quency, responding was substantially decreased, 
though not eliminated. The Schwartz (1973a) study 
was similar to that of Gamzu and Schwartz except that 
decreases in responding were even more dramatic. It
should be noted that these studies cannot logically
rule out the possibility that adventitious response- 
reinforcer relations contribute to the maintenance of 
responding on autoshaping procedures. Only pro- 
cedures employing a negative response-reinforcer de- 
pendency can do that. However, they do make it clear 
that such relations are not sufficient. 
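
The contrast between the two accounts can be made concrete with a small calculation. The sketch below is only an illustration: the index it uses (food rate during the trial minus food rate during the ITI) and the food rate itself are assumptions chosen for the example, not measures or values reported in the studies cited. Only the trial and ITI durations come from the text.

```python
# Illustrative sketch only. The index below (trial food rate minus
# ITI food rate) is an assumed, simplified measure of differential
# predictiveness; it is not taken from the studies discussed.

def differential_predictiveness(trial_rate, iti_rate):
    """Food rate in the trial minus food rate in the ITI (per sec)."""
    return trial_rate - iti_rate


def overall_food_rate(trial_rate, iti_rate, trial_sec=8.6, iti_sec=40.0):
    """Food deliveries per sec averaged over one trial-plus-ITI cycle.

    The default durations are the 8.6-sec trials and 40-sec average
    ITIs described in the text.
    """
    cycle = trial_sec + iti_sec
    return (trial_rate * trial_sec + iti_rate * iti_sec) / cycle


TRIAL_RATE = 1 / 30  # hypothetical: one food delivery per 30 sec of trial

# Differential procedure: food only during the trial.
diff_only = differential_predictiveness(TRIAL_RATE, 0.0)
# Nondifferential procedure: ITI food at the same rate as trial food.
nondiff = differential_predictiveness(TRIAL_RATE, TRIAL_RATE)

# Predictiveness falls when ITI food is added ...
print(diff_only > nondiff)
# ... even though the overall rate of food delivery rises.
print(overall_food_rate(TRIAL_RATE, TRIAL_RATE)
      > overall_food_rate(TRIAL_RATE, 0.0))
```

On this simplified index, adding ITI food raises the overall rate of food delivery while removing the differential predictiveness of the trial stimulus, which is the situation in which the two accounts make opposite predictions.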


THE RESPONSE-REINFORCER RELATION


The mere demonstration of the omission effect sug-
gests that autoshaped pecking is insensitive to its con-
sequences. However, while responding on automain-
tenance procedures is maintained at levels of 80-120
pecks per min, responding on the omission procedure
is maintained at substantially lower levels, often only
15-30 responses per min (Schwartz & Williams, 1972a,
1972b). This discrepancy suggests that the response-
reinforcer dependency does exert control over re-
sponding, though it is secondary to the control exerted
by the stimulus-reinforcer dependency. Williams and
Williams (1969) examined this possibility. A response
key was illuminated for 6-sec trials and followed by
food unless a key peck occurred (omission). A second
response key was illuminated whenever the first one
was. Pecks on this key had no programmed conse-
quences. Pigeons quickly learned to peck exclusively at
this second key, and thus obtain nearly all of the
scheduled reinforcement. From this result, Williams
and Williams argued that the key pecking which oc- 
curred on the omission procedure was sensitive to its 
consequences. Unfortunately, the design of the Wil-
liams and Williams experiment permits an alternative
interpretation. Let us assume that on each trial, the 
pigeon looks at only one key, and that the pigeon is 
likely to peck at the key it looks at. With this assump- 
tion, one can explain the Williams and Williams re- 
sults in purely Pavlovian terms. Despite the fact that 
both keys are simultaneously illuminated, if the 
pigeon only looks at one key per trial, it is experiencing 
automaintenance trials and omission trials separately. 
Automaintenance trials always terminate with food. 
Omission trials terminate with food only if the pigeon 
does not peck the key. Since the pigeon does peck the 
key, there are fewer omission key-food pairings than 
automaintenance key-food pairings. Moreover, the
omission key-food pairings are effectively on a partial 
reinforcement schedule, which often weakens Pav-
lovian conditioning. There might thus be stimulus-
reinforcer relations of different strengths between each
of the keys and food, and this might account for the 
observed difference in levels of responding. 

To test this account of the Williams and Williams
study, Schwartz and Williams (1972a) did an experi-
ment in which the frequency of automaintenance key-
food pairings and omission key-food pairings was kept
equal. The two types of pairings occurred on separate
trials indicated by different key colors. If an omission
trial terminated with food, a subsequent, yoked, auto-
maintenance trial also terminated with food. If the
pigeon pecked the key during the omission trial and
prevented food, food was not delivered at the end of
the yoked automaintenance trial. In this way, the two
types of trials differed only in the relation between
key pecks and food. Under these conditions pigeons
pecked the automaintenance key on many more trials
and at twice as high a rate as they pecked the omission
key. The data are presented in Figure 5. From this,
Schwartz and Williams concluded that autoshaped
responding was to some degree sensitive to its conse-
quences.

Fig. 3. Percentage of trials with at least one response and
responses per minute throughout the experiment, averaged across
all eight subjects in four-session blocks and separated according
to key color. To the left of the dotted vertical lines, the red key
was associated with the omission condition, the white key with
the yoked control. To the right of the dotted vertical lines, the
significance of the key colors was reversed. (From Schwartz
& Williams, 1972a. © 1972 by the Society for the Experimental
Analysis of Behavior.)
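
The yoking rule of this experiment can be written out as a short sketch. It is offered only as an illustration of the logic, under the assumption that each omission trial is paired with exactly one yoked automaintenance trial; the function name and the sample record of pecks are hypothetical.

```python
# Sketch of the yoking rule described in the text. The function name
# and the sample data are hypothetical.

def run_yoked_pair(pecked_on_omission_trial):
    """Return (food_on_omission, food_on_yoked_automaintenance).

    Food follows the omission trial only if no peck occurred. The
    yoked automaintenance trial copies that outcome, whatever the
    pigeon does on the automaintenance key.
    """
    food_on_omission = not pecked_on_omission_trial
    food_on_yoked = food_on_omission
    return food_on_omission, food_on_yoked


# Hypothetical record: did a peck occur on each omission trial?
omission_pecks = [True, False, False, True, False]
outcomes = [run_yoked_pair(p) for p in omission_pecks]

omission_pairings = sum(o for o, _ in outcomes)
yoked_pairings = sum(y for _, y in outcomes)

# The two key colors receive the same number of key-food pairings,
# so they differ only in the response-reinforcer relation.
print(omission_pairings == yoked_pairings)
```

Because the number of key-food pairings is equal by construction, any difference in responding between the two key colors cannot be attributed to differences in the stimulus-reinforcer relation.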

Some more recent research on the omission effect 
has examined in more detail the behaviors which are 
actually conditioned by the contingency relation be- 
tween the key light and food. Wessells (1974) observed 
that what was conditioned in addition to pecking was 
orientation toward and approach to the response key. 
When a negative contingency was established between 
any approach to the key and food, Wessells observed 
the gradual elimination of approach behavior. The 
response elimination required a good deal of experi- 
ence. After about 10 sessions, the subjects were still 
approaching the key on 30-40% of the trials. Never-
theless, what is of prime interest in this context is that
approach was eliminated by the negative response- 
reinforcer contingency. It should be noted, in addi- 
tion, that orientation toward the key continued at 
high frequency even after approach had been elimi- 
nated. A similar study by Browne and Peden (see 
Hearst & Jenkins, 1974) produced somewhat different 
results. Imposition of a negative contingency between 
approach and food never completely eliminated ap-
proach behavior. After 35 sessions, subjects were still
responding on 12-60% of the trials. In another study, 
Barrera (1974) observed that key pecking occurred at 
a lower rate under a negative contingency than under 
standard automaintenance conditions. However, he 
observed that pecking per se was occurring at as high 
a rate as ever. The effect of the negative contingency
was to displace pecking off the key. Interestingly, 
Dunham, Mariner, and Adams (1969) observed the 
same phenomenon when key pecks were punished 
with electric shock. 

There is more evidence that responses which occur
in the face of a negative response-reinforcer contin- 
gency are nevertheless sensitive to their consequences. 
Morrison (1974) studied automaintenance in the 
pigeon with water as a reinforcer. In conducting the 
research, he noticed that in addition to pecking
the response key, pigeons also “bowed” and “rooted”
when the signal was present. Morrison then syste-
matically examined the effects of a negative response-
reinforcer contingency on each of these behaviors. The
general result was that a negative contingency sup- 
pressed, but did not eliminate, the target behavior, 
with the result that each of the other behaviors in- 
creased in frequency. When a negative contingency 
was established simultaneously between each of the 
behaviors and food, the pigeons emitted all of the be- 
haviors rather than suppressing them all. Murray 
(1974) recently observed the same sort of effects in a 
study of Siamese fighting fish (Betta splendens). The 
reinforcer was presentation of a mirror, which reliably 
produces display in the fish. Murray observed four dis- 
tinct behaviors conditioned to the signal. When a 
negative contingency was imposed between any of the 
behaviors and the mirror, that behavior was sup-
pressed, while the others continued to occur. Like
Morrison, Murray observed that while
the negative contingency reduced the behavior, in all 
cases it failed to eliminate the behavior. 

The study by Murray is the first one we have dis-
cussed which examined the omission procedure in a
species other than the pigeon. Actually, the omission
procedure has also been studied with rats, chicks, and
squirrel monkeys, and has not always yielded the same
results. Wasserman (1973a) studied chicks with heat
as the reinforcer. Pecking and nuzzling the response
key were maintained despite the negative contingency.
Stiers and Silberberg (1974) observed maintenance of 
responding in rats in the face of a negative contin- 
gency. The signal for food was the insertion in the 
chamber of a retractable lever. The response was not
lever pressing, but lever contact, typically with the 
mouth or vibrissae. Interestingly, Stiers and Silber- 
berg observed licking, pawing, and biting in an auto- 
maintenance procedure, but nose contact of the lever 
on the omission procedure. Finally, Schwam and 
Gamzu (1975) studied the omission procedure with 
squirrel monkeys as subjects. Across different index 
responses, the uniform result was that responding was 
not maintained in the face of the negative contin- 
gency. Schwam and Gamzu explained the discrepancy 
between the squirrel monkey and the pigeon on omis- 
sion procedures by arguing that the omission effect 
occurs only in species for which there is a relatively 
rigid pattern of reinforcer-appropriate consummatory 
activity. In these cases the stimulus-reinforcer associa- 
tion determines the response which will occur. On the 
other hand, in species which have a large repertoire 
of consummatory activities (e.g., in the squirrel mon-
key, biting, mouthing, pawing, licking, etc.) there is
no inflexible link between any one of these activities
and food, so that any of them can be eliminated by 
the negative response-reinforcer relation. 


STIMULUS-REINFORCER AND 
RESPONSE-REINFORCER INTERACTION 


In the preceding discussion, we have shown that 
both stimulus-reinforcer and response-reinforcer rela- 
tions exert control over behavior in automaintenance 
and omission studies. In automaintenance the two 
types of relation may be mutually facilitative. The 
stimulus-reinforcer relation generates pecking, which 
may then be adventitiously reinforced with food. 
While adventitious reinforcement is not itself suffi-
cient to maintain responding, it may make a substan-
tial contribution to the high rates at which pecking 
occurs. In omission training, the stimulus-reinforcer 
and response-reinforcer relations are antagonistic. 
While the stimulus-reinforcer relation generates peck- 
ing, the response-reinforcer relation works, with only 
partial success, to eliminate pecking. Indeed, it is the 
lack of success of the negative response-reinforcer
contingency which has generated such interest in the
phenomenon. It is a tribute to the remarkable power
of the stimulus-reinforcer relation that it can compete
effectively with a response-reinforcer contingency for 
control of a behavior which is ordinarily quite sensi- 
tive to its consequences. We would like to be able to 
specify the variables which determine the relative 
contributions of these two types of contingency to the
outcome of omission training and automaintenance
studies. Unfortunately, there is little evidence on this 
point. One systematic investigation of the problem is 
a study by Williams (1974). The study involved four
groups of pigeons. One group was exposed to a vari-
ant of the omission procedure. The response key was
illuminated periodically, and if the pigeon did not
peck the key for 6 sec, the key light was extinguished 
and food was presented. Each time the pigeon pecked 
the key, the trial was restarted. Thus a trial did not 
terminate until 6 sec without a peck had elapsed, and 
every trial terminated with food. Conditions for a 
second group were determined by conditions obtained 
by the first group. Trials were identical in length. In 
this group, however, there was no programmed rela-
tion between responding and trial duration or food.
We shall call this group the “yoked omission” group.
A third group was exposed to a discrete trials DRL 
(differential reinforcement of low rate) procedure. 
When the key was illuminated, the pigeons had to 
peck the key after 6 sec elapsed to obtain food. Pre-
mature pecks reset the DRL timer and prolonged the
trial. Each trial terminated with food. Finally, the
fourth group was yoked to this DRL group. Re-
sponses had no programmed consequences, and trial
duration was determined by obtained trial duration 
in the DRL group. Each of the pigeons in the non- 
yoked groups was exposed to both DRL and omission 
procedures, in counterbalanced order. Each of the 
pigeons in the yoked groups was exposed to both 
yoked procedures, in counterbalanced order. Each 
group contained eight pigeons. 
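
The trial-termination rules for the two nonyoked groups can be sketched as follows. This is an illustrative reading of the procedures as described in the text, not a reconstruction of Williams's apparatus logic; the function names and the treatment of a peck falling exactly on the 6-sec boundary are assumptions. In the yoked groups, the durations obtained in this way would simply be imposed with no relation to the bird's own pecks.

```python
# Sketch of the two trial-termination rules, assuming peck times are
# given in seconds from trial onset. Names are hypothetical.

CRITERION_SEC = 6.0  # the 6-sec requirement stated in the text


def omission_trial_duration(peck_times):
    """Omission variant: the trial ends with food once 6 sec have
    passed without a peck; each peck restarts the trial."""
    deadline = CRITERION_SEC
    for t in sorted(peck_times):
        if t >= deadline:
            break  # trial already over; later pecks are irrelevant
        deadline = t + CRITERION_SEC  # a peck restarts the 6-sec clock
    return deadline


def drl_trial_duration(peck_times):
    """Discrete-trials DRL: the trial ends with food at the first peck
    made at least 6 sec after trial onset or after the preceding peck.
    Returns None if no peck ever meets the requirement."""
    timer_start = 0.0
    for t in sorted(peck_times):
        if t - timer_start >= CRITERION_SEC:
            return t
        timer_start = t  # a premature peck resets the DRL timer
    return None


print(omission_trial_duration([]))           # no pecks: minimum-length trial
print(omission_trial_duration([2.0, 5.0]))   # two pecks prolong the trial
print(drl_trial_duration([2.0, 5.0, 12.0]))  # third peck earns food
```

Under both rules every trial ends with food and pecking can only lengthen a trial, but a peck is required for food in the DRL case and must be withheld in the omission case.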

What prediction might one make about the levels 
of responding maintained in the four groups? Let us 
consider first the response-reinforcer contingency. The 
DRL group is exposed to a positive contingency, the 
omission group to a negative contingency, and the two 
yoked control groups to no contingency (except per- 
haps an adventitious one). On this basis alone, we 
would expect responding to be strongest in the DRL 
group and weakest in the omission group, with the 
other two groups somewhere in between. 

Now let us consider the stimulus-reinforcer rela- 
tion. As we have already mentioned above, increases 
in the trial/ITI ratio (i.e., increases in trial length
with ITI constant) reduce the level of responding 
maintained on autoshaping procedures (Baldock, 
1974). We would therefore expect that responding 
controlled by the stimulus-reinforcer relation in these 
groups will be inversely related to the trial duration 
obtained. Since the DRL should maintain more re-
sponding than the omission procedure, DRL trials
should be longer than omission trials. Hence stimulus- 
reinforcer control should be weaker in the DRL and 
DRL-yoked groups than in the other two groups. 
Since trial durations are equal in experimental groups 
and their yoked partners, we would expect no differ- 
ence in stimulus-reinforcer control between each ex- 
perimental group and its yoked partner. 

The results of the experiment, in asymptotic re- 
sponses per minute, were as follows: DRL—14.0; omis- 
sion—4.0; DRL-yoked—6.2; omission-yoked—25.0. Thus 
all predictions were confirmed. Omission-yoked pi- 
geons responded more than DRL-yoked pigeons (equal 
response-reinforcer contingencies but longer trials for 
the DRL-yoked group); DRL pigeons responded more 
than omission pigeons (difference in the response- 
reinforcer dependency); DRL pigeons responded more 
than their yoked partners, and omission pigeons re- 
sponded less than their yoked partners (in both cases 
as a result of equal trial durations but different 
response-reinforcer dependencies). The data provide 
clear evidence that stimulus-reinforcer and response- 
reinforcer relations are both importantly involved in 
the control of responding by these procedures. 
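
The four comparisons just listed can be checked directly against the reported asymptotic rates. The snippet below is a minimal sketch that encodes only the comparisons and values stated in the text; the variable names are hypothetical.

```python
# Asymptotic responses per minute as reported in the text.
rates = {
    "DRL": 14.0,
    "omission": 4.0,
    "DRL-yoked": 6.2,
    "omission-yoked": 25.0,
}

# The four ordinal comparisons listed in the text.
checks = {
    # Equal response-reinforcer contingency, longer trials for DRL-yoked.
    "omission-yoked > DRL-yoked": rates["omission-yoked"] > rates["DRL-yoked"],
    # Difference in the response-reinforcer dependency.
    "DRL > omission": rates["DRL"] > rates["omission"],
    # Equal trial durations, different response-reinforcer dependencies.
    "DRL > DRL-yoked": rates["DRL"] > rates["DRL-yoked"],
    "omission < omission-yoked": rates["omission"] < rates["omission-yoked"],
}

for label, holds in checks.items():
    print(label, holds)
```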
Are the Behaviors Influenced by
Stimulus-Reinforcer/Response-Reinforcer
Relations Different?


Throughout the discussion thus far we have been 
assuming that the pecking which is influenced by 
stimulus-reinforcer relations and occurs on automain- 
tenance and omission procedures belongs to the same 
class as the pecking which is maintained by response- 
reinforcer relations and occurs in standard free op- 
erant procedures. It is possible, however, that the two 
types of contingency control two different types of 
peck which are as independent of one another as 
salivation and panel pushing in the dog. Schwartz and 
Williams (1972b) conducted a series of experiments 
designed to explore this possibility. The property of
key pecks they measured was their duration, since 
Wolin (1968) had previously observed that key peck 
duration was influenced by the nature of the rein-
forcer, either food or water. Schwartz and Williams
found that key pecks maintained on the omission pro- 
cedure were of almost uniformly short duration—less 
than 20 msec. Distributions of response duration on 
standard fixed-interval and fixed-ratio schedules also 
included short duration pecks, but the majority of 
pecks were long in duration—greater than 40 msec. 
This suggested that there might indeed be two differ- 
ent classes of key peck, one of which (short-duration) 
was reflexive, controlled by stimulus-reinforcer con- 
tingencies and representing the dominant response on 
omission procedures, while the other (long duration) 
was nonreflexive, sensitive to response-reinforcer con- 
tingencies and representing the dominant response on 
free operant reinforcement schedules. To test this pos- 
sibility, Schwartz and Williams tried to differentially 
reinforce both short- and long-duration responses. The 
rationale behind the study was this: short-duration 
pecks, if insensitive to their consequences, would not 
increase in frequency when differentially reinforced, 
while long-duration pecks, if sensitive to their conse- 
quences, would increase in frequency if differentially 
reinforced. The results obtained by Schwartz and 
Williams supported this hypothesis. However, other 
investigations (cf. Moore, 1973) have failed to find 
duration differences across different procedures and 
have offered alternative interpretations for the data 
observed by Schwartz and Williams. Thus the argu-
ment for two kinds of key peck must be taken as
tentative. 
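
The duration criterion can be illustrated with a short sketch. The two cutoffs are the ones given in the text (less than 20 msec, greater than 40 msec); the "intermediate" label for pecks between the cutoffs and the sample durations are assumptions made for the example, not part of the original analysis.

```python
from collections import Counter

SHORT_MAX_MSEC = 20  # "less than 20 msec" in the text
LONG_MIN_MSEC = 40   # "greater than 40 msec" in the text


def classify_peck(duration_msec):
    """Assign a peck to a duration class using the cutoffs in the text."""
    if duration_msec < SHORT_MAX_MSEC:
        return "short"
    if duration_msec > LONG_MIN_MSEC:
        return "long"
    # The text does not assign 20-40 msec pecks to either class;
    # this label is an assumption of the sketch.
    return "intermediate"


# Hypothetical peck durations in msec, for illustration only.
durations = [12, 15, 18, 55, 62, 30, 48]
counts = Counter(classify_peck(d) for d in durations)
print(counts["short"], counts["long"], counts["intermediate"])
```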

If there are two different kinds of key peck, each 
sensitive to different variables, how is one to explain 
the apparent interaction of response-reinforcer and 
stimulus-reinforcer contingencies which the research
reviewed above suggests? There are two possibilities.
One is that the apparent interaction is not an inter- 
action at all. Instead, stimulus-reinforcer relations 
control short-duration pecks and response-reinforcer 
relations control long-duration pecks, and the “inter- 
action” simply results from the fact that in most stud- 
ies both kinds of key peck are lumped together as 
instances of switch closure. A second and more in- 
triguing possibility is that the two kinds of peck do 
interact, but only indirectly. This possibility has been 
discussed by Gamzu (1971) and by Schwartz and Wil-
liams (1972b). We might call it the “minimal-unit”
hypothesis. Briefly, the argument is this: short-duration 
pecks are generated and controlled by stimulus- 
reinforcer relations. Moreover, they comprise the basic 
biological units out of which long-duration, operant 
pecks develop. Furthermore, the long-duration pecks 
continue to depend for their occurrence upon the 
simultaneous occurrence of short-duration pecks. Thus 
the operant key peck may be viewed as an “anaclitic 
operant” (Kimble and Perlmutter, 1970), since it is 
built from, and depends upon, the members of a 
different response class. From this account, the inter- 
action between stimulus-reinforcer and response- 
reinforcer relations reflects the dependence of reflexive 
pecks on the former, the dependence of operant pecks 
on reflexive pecks, and, finally, the sensitivity of oper- 
ant pecks to their consequences. It should be noted that 
this account is almost entirely unsupported, and at 
present it raises more questions than it answers. How, 
for example, do operant pecks develop out of re- 
flexive pecks? This question is merely a specific re- 
statement of a question which has haunted experimen- 
tal psychology since its inception: how does voluntary 
behavior emerge out of the collection of infantile re- 
flexes? Also, one might wonder whether enough ex- 
perience with operant contingencies eventually frees 
the operant peck from its reflexive origins. These and 
other questions require empirical investigation. For 
the present, let us discuss one finding which offers 
some support for the minimal-unit hypothesis. 

We discussed above, in the section which addressed 
the conditions necessary for the acquisition of peck- 
ing in autoshaping procedures, a series of studies by 
Gamzu and Williams (1971, 1973). The reader will 
recall that when food presentation was as likely dur- 
ing the intertrial interval as during the trial, key 
pecking either did not develop or, if already devel- 
oped, was eliminated. Gamzu and Williams observed 
that pigeons initially exposed to a differential pro- 
cedure (food presented only during the trial) would 
peck the key at least 60 times per min. However, they 
also observed the curious phenomenon that if the
pigeons were initially exposed to a nondifferential
procedure (food equally likely during trial and ITI)
during which they did not peck the key and were sub- 
sequently exposed to a differential procedure, key 
pecking developed at a normal rate but reached a 
much lower asymptotic level. This effect persisted
seemingly indefinitely. The explanation offered by 
Gamzu (1971) was that during the nondifferential 
procedure a feeding-related behavior other than key 
pecking occurs and is maintained (Staddon and Sim-
melhag, 1971). When the differential procedure is
introduced and key pecking develops, the other be- 
havior continues to occur and to be reinforced. Thus 
the development of operant pecks is essentially blocked
by the occurrence of these adventitiously reinforced
other behaviors, and the pecking one does observe is
strictly under the control of the stimulus-reinforcer 
relation. The implication of this account is that the 
duration of key pecks which occur under these con- 
ditions should be almost exclusively short. In an un-
published portion of his doctoral dissertation Gamzu
(1971) measured response durations. In Figure 6
durations are presented for a pigeon exposed to the
differential procedure after the nondifferential pro-
cedure. It can be seen that both early and late in
training, response durations are exclusively short. 
Contrast this with the data in Figure 7 for a pigeon
exposed to the differential procedure from the outset.
Here response durations are short early in training,
but by later sessions there are substantial numbers of
long-duration responses. These data support the views
that (a) there are two different kinds of key peck, 
identifiable on the basis of duration; (b) the two types 
of peck are controlled by different variables; and (c) 
short-duration pecks seem to occur initially while 
long-duration pecks only develop with experience. 
Moreover, the fact that a procedural shift from differ- 
ential to nondifferential conditions eliminates pecking 
entirely, despite the fact that only the stimulus-rein- 
forcer relation is changed, suggests that instrumental 
responses are indirectly controlled by stimulus-rein- 
forcer contingencies. 


Autoshaping and Automaintenance: 
Theoretical Analysis 


In the sections above we have suggested that auto- 
shaping is best described as Pavlovian conditioning 
and that automaintenance entails the joint action of 
Pavlovian and operant contingencies. There have been 
two attempts to capture the autoshaping literature 
theoretically, one by Hearst and Jenkins (1974) and 
one by Williams (1974). 

Fig. 6. Relative-frequency histogram of response durations of a
single pigeon during two sessions of the differential procedure.
Prior to the introduction of this procedure the pigeon was
exposed to the nondifferential procedure for 14 days.
(From Gamzu, 1971.)

Fig. 7. Relative-frequency histogram of response durations of a
single pigeon during two sessions of the differential procedure,
which was the first procedure to which the bird was exposed.
(From Gamzu, 1971.)

Hearst and Jenkins treat autoshaping and auto-
maintenance as an instance of sign tracking, which is
defined as “behavior that is directed toward or away 
from a stimulus as a result of the relation between 
that stimulus and the reinforcer or between that stim- 
ulus and the absence of the reinforcer” (p. 4). They 
suggest that sign tracking is a general phenomenon 
and may contribute substantially to many observa- 
tions in the study of discrimination learning. Our 
primary concern here, however, is with sign tracking 
as an account of autoshaping. 

In Hearst and Jenkins’s view, as long as a stimulus 
can be localized, so that behavior can be directed 
toward or away from it, such behavior will develop as 
a function of the relation between that stimulus and 
a reinforcer (US). The importance of localizability of 
the CS has been indicated by Wasserman (1973b). One 
group of pigeons was exposed to an autoshaping pro-
cedure with the houselight illuminated. A second
group was exposed to the same procedure but with the 
houselight off. Only the first group acquired key peck-
ing. Wasserman’s explanation is that in a dark
chamber the key light provides general illumination 
which can be seen anywhere, i.e., it is a nonlocalized 
CS. On the other hand, when the chamber is lit, the 
pigeon must look at the key to see the stimulus 
change, i.e., the CS is localized. 

Thus the sign-tracking view places most of its 
emphasis on the stimulus side of the phenomenon. It 
does not explain why pigeons peck the key rather than 
engage in some other directed behavior. Hearst and 
Jenkins (1974) acknowledge the fact that in almost all 
autoshaping studies the conditioned response is a 
component of the unconditioned response to the 
reinforcer. However, the implication of their account 
is that if the signaling stimulus were somehow inap- 
propriate for directed consummatory activity, some 
other activity (e.g., approach) would nevertheless be 
directed at it. In the standard pigeon autoshaping 
experiment, there is no way to test this view. The CS 
is typically response key illumination, and thus is in 
many ways an ideal stimulus for pecking (Cruze, 1933;
Fantz, 1957; Hunt & Smith, 1967; Padilla, 1935). 
There are some autoshaping studies, however, in 
which tones rather than lights were used as signals. 
Gamzu (1968) and Schwartz (1973a) both failed to ob- 
serve pecking at the tone source in such experiments. 
More significantly, the pigeons in these studies also 
failed to reliably approach the tone. After a brief
period of orientation to the tone early in the experi- 
ments (owing, presumably, to its novelty), orientation 
ceased, and in later tone presentations pigeons simply 
moved to the feeder. Jenkins (in Hearst & Jenkins, 
1974) did manage to condition approach to a tone 
source and even conditioned pecking, when the tone 
was localized behind a continuously illuminated 
perforated hemisphere. Thus the efficacy of a tone as
a signal is still debatable. What is clear is that a tone 
is far less effective than a key light. 

In our view, the sign-tracking account of auto- 
shaped key pecking places too much emphasis on the 
key and not enough emphasis on pecking. The auto-
shaping phenomenon raises two questions: Why does
the pigeon peck, and why does the pigeon peck the 
key? Hearst and Jenkins may have provided a satisfac- 
tory answer to the latter question. However, the an- 
swer to the former question is different. Pecking is 
conditioned because it is the central component of 
the feeding pattern of the pigeon. As Staddon and 
Simmelhag (1971) have shown, pecking is observed in 
pigeons when food is presented at regular intervals 
with no signal. Thus it is the mere presentation of 
food which engenders pecking. The key light in the 
autoshaping procedure directs pecks but does not gen-
erate them. Moreover, the nature of the reinforcer
(US) in an experiment determines not only what be-
havior will occur, but also which conditional stimuli
will be maximally effective. Both Shettleworth (1972b)
with chicks and Foree and LoLordo (1973) with
pigeons have shown that visual CSs dominate auditory
ones only when food rather than electric shock is the
US.

Fig. 8. Schematic outline of the critical events underlying bicon-
ditional behavior. See text for details. (From Williams, 1974.)

A second and somewhat more comprehensive ac- 
count of autoshaping and automaintenance has been 
provided by Williams (1974). Autoshaped pecking, 
to Williams, is an instance of “biconditional behav-
ior.” This term refers to the fact that both stimulus-
reinforcer and response-reinforcer contingencies play
significant roles in controlling key pecking. Williams’s
conceptual scheme is presented in Figure 8, where S
designates a stimulus, S* the reinforcing event, R* the
consummatory response, and R the conditioned be-
havior. A₁ refers to the associative link between S and
S*. As we discussed above, the formation of an associ-
ation seems to require a differential predictive relation
between S and S*, and in this respect autoshaping
does not differ from standard Pavlovian conditioning. 
Precise specification of the conditions necessary for 
the formation of an association is a problem Williams 
refers to as the “psychophysics of association.” It is a 
general problem of definition and specification of 
contingency spaces and has already been discussed 
above. The main point, again, is that autoshaping is 
in no sense unique in terms of the nature of the S—S* 
association.
It is just the S-S* relation which generates the be-
havior R, and this process is labeled f(A₁) in Figure
8. Williams argues that the form which a behavior 
takes as a function of the S—S* relation is a problem 
which is quite distinct from characterizing the as- 
sociation itself. He refers to this problem, which in- 
volves the relation between R and R*, as the “biology 
of association.” In general, the answer to this question 
will be specific to the species, reward, and situation 
under investigation. Williams argues, in the case of 
autoshaping, that the R set is a collection of un- 
learned consummatory behaviors. The learning which 
occurs in autoshaping is the development of control 
over response R by a new stimulus, S, such that S
becomes capable of releasing the behavior R. Thus
Williams identifies his account as a “learned release”
hypothesis. Whether all Pavlovian conditioning ex-
periments should be viewed as learned release experi-
ments or whether autoshaping is unique in this regard
is discussed in detail by Williams. As we suggested 
above, the question of what is actually conditioned in 
Pavlovian conditioning experiments has been more 
evaded than analyzed in the past. Whether stimulus 
substitution, stimulus surrogation, learned release, or 
some other label aptly characterizes most or all condi- 
tioning results cannot at present be addressed with 
any confidence. 

The feature of Williams’s account which is unique
to the autoshaping literature is the relation between
the R set and the S* event, labeled A₂ and g(A₂) in
Figure 8. This of course reflects an operant contin-
gency between response and reinforcer. Williams’s
account incorporates the fact that once behavior is
generated by an S-S* association, it is subsequently
further enhanced by an R-S* relation. Both f(A₁)
and g(A₂) feed into the R set and increase the prob-
ability of R. Williams’s model is thus an explicit
effort to account for the joint action of the two types
of contingency in the determination of a particular
class of behavior. The function g(A₂) is a positive
feedback loop which, once the behavior has occurred,
will insure its continued occurrence. This feature of
Williams’s model is specific to autoshaping, as we
have said. This, however, is more historical accident
than logical necessity. The fact that key pecking has 
for so long been studied in operant situations and has 
so clearly been shown to be sensitive to its conse-
quences demands that an account of Pavlovian con-
trol of pecking also include a vehicle for control by
response-contingent reinforcement. It is entirely pos- 
sible that other behaviors which have been tradition- 
ally studied in Pavlovian contexts are also sensitive to 
R-S* links which are built into experimental pro-
cedures. This possibility has simply not been sys-
tematically explored.


THE ROLE OF STIMULUS-REINFORCER 
RELATIONS IN THE CONTROL OF 
BEHAVIOR MAINTAINED BY 
RESPONSE-REINFORCER RELATIONS 


Having reviewed above the phenomena of auto- 
shaping, automaintenance, and omission, and having 
suggested that key pecking in these Pavlovianlike 
procedures is influenced by both stimulus-reinforcer 
and response-reinforcer relations, it seems appropriate 
to ask now whether the same kind of interaction can 
be demonstrated in operant procedures which bear 
no obvious formal relationship to Pavlovian ones. An 
understanding of the influence of autoshapinglike 
phenomena in standard operant situations is essential 
before the full significance of autoshaping for the 
experimental analysis of behavior can be evaluated. 
In the remainder of the chapter we shall explore the 
possibility that in situations in which control of 
behavior by response-reinforcer contingencies is dra- 
matic and unequivocal, stimulus-reinforcer contingen- 
cies nevertheless play a vital role. Most of the phe- 
nomena we shall discuss come from the literature on 
multiple schedules of reinforcement. We shall first 
review the phenomena observed on multiple schedules 
and the standard accounts of these phenomena. We 
shall then apply some of the principles which have 
developed out of the study of autoshaping and auto- 
maintenance to these phenomena. We shall argue that 
no account of multiple-schedule phenomena will be 
accurate unless it includes an analysis of autoshaping- 
like stimulus-reinforcer relations and that, indeed, the 
most dramatic findings in studies of multiple sched- 
ules result from the influence of these relations. 

That a prototypic operant, key pecking, can be 
influenced by Pavlovian operations does not mean 
that it must be so influenced. It is possible that, in 
standard operant situations, the control of behavior
by response-reinforcer relations simply dwarfs the in- 
fluence of stimulus-reinforcer relations, even though 
certain operant conditioning procedures have Pav- 
lovian stimulus-reinforcer contingencies built into 
them. The remainder of this chapter will be con- 
cerned with assessing the influence of these Pavlovian 
contingencies on behavior which is already controlled 
by operant contingencies. This issue will be discussed 
mainly in the context of the control of behavior by 
multiple schedules of reinforcement. A multiple sched-
ule is one “in which two or more component sched-
ules operate in alternation, each in the presence of a
different stimulus” (Catania, 1968, p. 339).

Multiple-schedule procedures are instances of what 
has traditionally been called successive discrimination. 
Conditions of reinforcement in the presence of one 
stimulus typically differ from conditions of reinforce- 
ment in the presence of a second stimulus. Of major 
interest is the extent to which the two stimuli control 
behavior appropriate to their correlated reinforce- 
ment conditions and the extent to which the com- 
ponent schedules interact. The emphasis in the study 
of multiple schedules is on maintenance of inter-
mittently reinforced behavior rather than its acquisi-
tion. These characteristics set the study of multiple
schedules apart from most other successive discrimina- 
tion studies. 


Interactions in Multiple Schedules 


The feature of the control of behavior by multiple 
schedules which has attracted the greatest research 
interest is the interaction of the component sched- 
ules. Suppose, for example, a pigeon is pecking a 
response key illuminated by a red light for reinforce- 
ments programmed on a variable-interval 2-min (VI 2- 
min) schedule. When the pigeon’s behavior is stable, 
the procedure is changed so that 3-min periods of red 
key illumination alternate with 3-min periods of 
green key illumination. The same VI 2-min schedule 
is in effect in the presence of both key colors. This 
procedure is defined as a multiple VI 2-min VI 2-min 
schedule (mult VI 2-min VI 2-min). It differs from the
preceding only in that there are two alternating stim-
uli instead of one. Suppose responding to the red key
in the mult differs in some way from responding in 
the previous procedure when the key was always red. 
This difference would be presumed to result from an 
interaction between the two component schedules. 
Indeed, there is evidence that responding is main- 
tained at a higher rate by a simple VI schedule than 
by a mult VI VI, in which the value of the VI sched- 
ules is the same as in the simple VI (Bloomfield, 
1967). 

Now suppose the behavior of our pigeon in the 
mult VI 2-min VI 2-min schedule has stabilized. The 
procedure is then changed to mult VI 2-min extinc- 
tion (EXT). When the key is red, the same VI 2-min 
schedule is in effect as before, but when the key is 
green, no reinforcement is scheduled. The effects of 
this procedural change are twofold. First, responding 
on the green key decreases. This effect is not consid- 
ered the product of an interaction, since the schedule 
on the green key has changed from VI to EXT. Sec-
ond, responding on the red key increases. This in-
crease must be attributed to an interaction between
the two components of the multiple schedule, since 
the reinforcement schedule in the presence of the red 
stimulus has not been altered. This particular type of 
interaction is called positive behavioral contrast 
(Reynolds, 1961a), and it will command most of our 
attention in the remainder of this chapter. Before 
actually reviewing research on interactions in multi-
ple schedules, however, we shall discuss some general
issues regarding the terminology and measurement of 
interactions. 

Interactions between components of a multiple
schedule can never be directly assessed. If one is inter-
ested in the effects of component B of a multiple
schedule on responding in component A, then one
alters conditions in component B, keeps conditions in
component A constant, and looks for changes in re-
sponding in component A.

If responding in component A changes when com-
ponent B conditions are changed, then one might
infer that B has been influencing A all along. Thus
assessments of interaction in multiple schedules typi-
cally require within-subject, across-procedure compari-
son. Usually one component schedule remains constant
from one procedure to the next while the other
component changes. While interactions probably oc-
cur on all multiple-schedule procedures, an inter-
action can only be unequivocally demonstrated and
categorized in a multiple-schedule component which
is unchanged. While the strategy for demonstrating
interactions seems straightforward, there are some
complexities. Suppose pigeons are exposed first to a
mult VI 1-min VI 1-min schedule, and then to a mult
VI 1-min EXT schedule. Responding in the un-
changed VI component increases (positive contrast).
Is this a clear indication of schedule interaction? It 
certainly seems to be, since the VI component has not 
changed while the behavior in that component has 
changed. However, what has also changed is the 
amount of exposure to the procedure. It is possible 
that responding changes merely as a function of ses- 
sions of exposure to a procedure. Thus in order 
clearly to identify the effect as an interaction, it must 
be shown that if the pigeons are returned to the mult 
VI 1-min VI 1-min procedure, responding in the
unchanged component will decrease and return to its
initial levels. In short, in order to demonstrate com-
ponent interaction in a multiple schedule, the inter-
action effect must be reversible, i.e., the base line
against which interactions are assessed must be re-
coverable. Many studies which have purported to
demonstrate schedule interaction have failed to satisfy
this baseline recovery criterion (cf. Gonzalez & Champ-
lin, 1974).

To summarize, the ideal procedure for demon- 
strating multiple-schedule interactions contains three 
stages: first, exposure to a mult with equal compo-
nents until behavior is stabilized; second, alteration
of one component schedule; and third, return to the 
first procedure to recover the baseline. 
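
The three-stage logic can be made concrete with a minimal sketch in Python. The function name, the example rates, and the 10% tolerance used to decide whether the base line has been "recovered" are illustrative assumptions; the text itself specifies no numerical criterion.

```python
def demonstrates_interaction(baseline, changed, recovery, tolerance=0.10):
    """Check the three-stage (baseline, change, return) criterion.

    Each argument is the response rate (responses/min) in the UNCHANGED
    component during one stage. `tolerance` is a hypothetical proportion
    of the baseline rate: it decides both whether responding changed in
    stage two and whether the base line was recovered in stage three.
    """
    if baseline <= 0:
        raise ValueError("baseline rate must be positive")
    shifted = abs(changed - baseline) / baseline > tolerance
    recovered = abs(recovery - baseline) / baseline <= tolerance
    return shifted and recovered


# Hypothetical rates: responding rises when the other component goes to EXT
# and returns to its initial level when the first procedure is restored.
print(demonstrates_interaction(50.0, 80.0, 52.0))
# Responding rises and stays high, so exposure alone cannot be ruled out.
print(demonstrates_interaction(50.0, 80.0, 78.0))
```

Under these assumptions only the first case would count as a demonstrated interaction; in the second, the change in responding might simply reflect the number of sessions of exposure.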


Types of Interaction 


There are four possible types of schedule inter-
action: positive and negative contrast and positive and
negative induction. Part A of Figure 9 presents sche- 
matic diagrams of positive contrast and negative in- 


Fig. 9. Schematic diagrams of the four types 
of behavioral interaction. In A, after be-
havior has stabilized on a mult VI 3-min VI
3-min, the schedule associated with the sec-
ond component is changed to EXT and
response rate decreases (dashed line). The 
upper portion of the figure demonstrates 
positive contrast (shaded area). The lower 
portion of A demonstrates negative induction 
(shaded area). In B, after stabilization on 
mult VI 3-min VI 3-min, the schedule in the 
second component is changed to VI 1-min, 
i.e., more reinforcements are available per
unit time (solid line). The shaded area in 
the upper portion of B is an example of 
negative contrast, while the shaded area in 
the lower portion of B is an instance of 
positive induction. Note that in all four 
cases the schedule in the first component 
does not change, nor does reinforcement 
density. Thus these changes are the result of 
interactions with the adjacent schedule. 






duction. Positive contrast is defined as an increase in 
responding in an unchanged component of a multiple 
schedule with decreases in responding in the other 
component. Negative induction is defined as a de- 
crease in responding in an unchanged component of 
a multiple schedule with decreases in responding in 
the other component. Positive induction and negative 
contrast are diagramed in part B of Figure 9. Positive 
induction is an increase in responding in an un-
changed component of a multiple schedule with in-
creases in the other component, while negative
contrast is a decrease in responding in an unchanged 
component of a multiple schedule with increases in 
the other component. These four types of interaction
are not independent. In any particular experimental 
manipulation only two types of interaction are pos- 
sible. Decreases in responding in the changed compo- 
nent of a multiple schedule can produce either 
increases (positive contrast) or decreases (negative in- 
duction) in the other component. Similarly, increases 
in responding in the changed component can produce 
either decreases (negative contrast) or increases (posi- 
tive induction) in the unchanged component. It 
should be noted that all four types of interaction are 
defined in terms of response rate rather than some 
other feature of the experiment. This is meant only to 
be descriptive. It does not imply that response rate 
changes in one component cause rate changes in the 
other component. We shall see below that while some 
investigators have argued this position, the matter is 
still quite controversial. The definitions given here 
are not meant to imply support for any particular
causal account of schedule interaction. 
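
The four definitions reduce to two binary distinctions: the direction of the rate change in the unchanged component, and whether that direction matches or opposes the change in the altered component. A minimal Python sketch follows; the function name and the treatment of a zero change as "no interaction" are assumptions made for illustration, and, like the definitions themselves, the sketch is purely descriptive and implies no causal account.

```python
def classify_interaction(unchanged_delta, changed_delta):
    """Name the interaction from the direction of two rate changes.

    unchanged_delta: change in response rate in the unchanged component.
    changed_delta:   change in response rate in the altered component.
    Returns None when either rate did not change, since the definitions
    cover only the four directional combinations.
    """
    if unchanged_delta == 0 or changed_delta == 0:
        return None
    sign = "positive" if unchanged_delta > 0 else "negative"
    # Opposite directions define contrast; same direction defines induction.
    same_direction = (unchanged_delta > 0) == (changed_delta > 0)
    kind = "induction" if same_direction else "contrast"
    return f"{sign} {kind}"


# Hypothetical changes in responses/min (unchanged, altered component).
for deltas in [(+20, -40), (-10, -40), (-15, +30), (+12, +30)]:
    print(deltas, classify_interaction(*deltas))
```

For any one manipulation the sign of `changed_delta` is fixed, so only two of the four labels can be returned, which is the dependence among the four types noted above.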

In the sections to follow we first describe reported 
instances of the different types of schedule interaction. 
The remainder of the chapter will focus almost ex- 
clusively on behavioral contrast. We discuss the differ- 
ent accounts in the literature offered to explain 
behavioral contrast. We next review some of the 
temporal properties of contrast. Next the relation 
between schedule interactions in multiple and con- 
current schedules is discussed. Finally, a new account 
of contrast, based upon the phenomena of autoshap- 
ing and automaintenance, is presented and evaluated. 


POSITIVE BEHAVIORAL CONTRAST 


The classic demonstration of positive behavioral 
contrast was Reynolds’s (1961a) experiment in which 
pigeons were exposed to a series of multiple schedules 
with two alternating components. Each component 
was in effect for 3 min during which the key was 
illuminated by a specific color. At the end of the 3 
min the color of the key changed and the second
component was in effect. This cycle was repeated 30 
times each session. During the first component the 
schedule of reinforcement for pecking was VI 3-min. 
Several different schedules were used in the second 
component, but we shall focus on the transition from 
VI 3-min to EXT and back to VI 3-min. 

Figure 10 shows what happens to rate of respond- 
ing in the unchanged component of these different 
schedules. (The data are approximations from Rey- 
nolds’s figures and are plotted for only three of the 
pigeons, since the rate of responding for the fourth 
pigeon did not appear to be stable.) The introduction
of the EXT component resulted in a large increase in 
responding (positive contrast) in the unchanged VI 3- 
min component. Reintroduction of the VI 3-min 
schedule in the second component reversed this effect 
and restored the original response rate. 

Many of the features of Reynolds’s experiment
have become standard parts of investigations of posi- 
tive contrast. First, the reinforcement schedule in the 
unchanged component is typically a VI, though occa-
sionally fixed-ratio schedules (Reynolds, 1961b, 1961c), fixed-
interval schedules (Reynolds & Catania, 1961; Stad-
don, 1969) and DRL schedules (Reynolds, 1961b) have
been used. Second, the reinforcement schedule in the 
changed component is typically also VI, and the 
change is to EXT. However, many other procedures 
have been used (Brethower & Reynolds, 1962; Terrace, 
1968), and one of the uncertainties in the contrast 
literature centers on what procedure changes are nec- 
essary to produce contrast. Third, usually the same 
response is required and the same reinforcer delivered 
in both components of the multiple schedule. An 
exception is a study by Scull and Westbrook (1973) 
in which key pecking was required in one component 
and bar pressing in the other. This experiment failed 
to result in positive contrast. 

Finally, the use of pigeons as subjects is a crucial 
feature of the basic demonstration of positive con- 
trast. Species differences are evident in contrast ex- 
periments, and indeed the theory of contrast that we 
shall propose predicts that this should be so. When 
rats are subjects, the results are equivocal. These
studies of contrast with rats fall into three categories 
according to the experimental results, which occasion- 
ally give evidence of contrast in rats, more often are 
equivocal, and sometimes clearly fail to find contrast 
in this species. 

The first, small, category includes studies that pro- 
vide positive evidence for contrast. Coates (1972) ex- 
posed rats to a mult VI 30-sec VI 30-sec until response 
rates had stabilized. Then the procedure was changed 
to a mult PUN+EXT, VI 30-sec, EXT, VI 30-sec, in
the first component of which not only was bar press-
ing not reinforced but each response resulted in shock.
Response rates in the VI components increased, with
rate in the VI following PUN+EXT slightly higher
than in the VI following regular EXT. Wilkie (1972)
exposed four rats to a mult VI 30-sec VI 30-sec and
then changed the second component to EXT. In all
four subjects response rates in the VI component in-
creased. Henke, Allen, and Davison (1972) studied
four rats on a mult VI 1-min EXT after a mult VI
1-min VI 1-min, and again all four subjects showed
positive contrast in the unchanged component. Thus
positive contrast can be observed in rats.

Fig. 10. Individual response rates
in the unchanged VI 3-min com-
ponent of a series of multiple
schedules for three pigeons. (Es-
timated from Reynolds, 1961a.)

More characteristic of the outcome of contrast
studies using rats is the second category, in which
equivocal results are obtained. Pear and Wilkie (1970)
exposed two rats to six sessions of VI 30-sec, followed
by a number of sessions on MIX VI 30-sec EXT. (A
mixed schedule is just like a multiple schedule, except
that components are not signaled.) After 10 sessions,
one rat showed positive contrast, but the other showed
negative induction to the extent that the VI 30-sec
had to be changed to a VI 20-sec to maintain respond-
ing. In a second experiment (Pear & Wilkie, 1971)
eight rats were exposed to the following sequence of
schedules: VI; mult VI EXT; mult VI VI. The VI
schedule was always VI 30-sec. In the transition from
simple VI to mult VI EXT, five of eight rats showed
positive behavioral contrast. However, two of the
other three rats showed negative induction. A final
complicating factor is that the elevated VI rate in
mult VI EXT shown by two rats did not return to
base line in stage III, when both components were
equal VIs.

Other mixed results come from a set of experiments 
employing shock. In these studies rats are first exposed 
to a mult VI 30-sec fixed ratio (FR) 10. Subsequently
shock is introduced to the second component, so that
each tenth response results in both a food pellet and a
shock (Cook & Davidson, 1973; Cook & Sepinwall,
1974; Davidson & Cook, 1969; Sepinwall, 1973). The
initial effect of introducing shock is negative induc-
tion (Cook & Davidson, 1973; Davidson & Cook, 1969),
but after a few sessions there is often evidence of posi- 
tive contrast (Sepinwall, personal communication). 

The third class of rat studies either failed to find 
positive contrast or found negative induction. The 
procedures used have varied from mult VI EXT after 
simple VI (Freeman, 1971a; Jaffe, 1973; Weiss, 1971) 
or after mult VI VI (Dickinson, 1973) to mult VI VT
(a VT or variable-time schedule is one in which rein-
forcements are delivered at irregular intervals inde-
pendent of responding) after either simple VI or mult
VI VI (Freeman, 1971a; Lattal & Maxey, 1971).


DEMONSTRATIONS OF 
NEGATIVE CONTRAST 


There are certain logical problems in evaluating 
negative contrast. Rachlin (1973) has suggested that 
positive and negative contrast are the same phenom- 
enon, but that they occur at different points in an 
experimental sequence. Consider the top of part A in 
Figure 9, which depicts positive contrast schematically. 
Suppose a third procedure, the return to VI VI after 
VI EXT, to recover base line, were added there and 
the base line recovered. Base line recovery means that 
responding in the unchanged component decreases as 
responding in the changed component increases. But 
this defines negative contrast. Thus an unambiguous 
demonstration of positive contrast entails a later 
demonstration of negative contrast, and there is a log- 
ical argument for treating them as instances of the 
same phenomenon. Investigators in the past have
failed to notice this interdependence between the two
types of contrast because they have explicitly designed
experiments to look for one or the other type.

Fig. 11. Response rate to the operant key in the unchanged
(VI 3-min) component of a mult VI 3-min VI 72-sec as a pro-
portion of response rate in the same component in a prior mult
VI 3-min VI 3-min. Points below the dashed line are indicative
of negative contrast; points above the dashed line are indicative
of positive induction. The data in the left-hand panel are from
a series of conventional multiple schedules and show clear nega-
tive contrast. In the right-hand panel both components were
signaled on a separate key. Responses to the second key were
recorded but had no experimental consequences. (From Schwartz,
1974a.)

Terrace (1968) exposed three pigeons to a mult 
VI 5-min VI 5-min and then to a mult VI 1-min VI 5- 
min. Only one pigeon showed negative contrast (de- 
creases in responding in the unchanged VI 5-min com- 
ponent). Nevin (1968) observed negative contrast in a 
mult with a VI 3-min schedule in the constant com- 
ponent and different DRO (differential reinforcement 
of other behavior) schedules in the changing com- 
ponent. There are a number of demonstrations of 
local negative contrast (contrast with temporal charac- 
teristics) which will be taken up in detail in a later 
section (e.g., Bernheim & Williams, 1967; Nevin & 
Shettleworth, 1966; Williams, 1965). The only clear
demonstration of large negative contrast effects in 
multiple schedules was obtained by Schwartz (1974a, 
1975). Pigeons were shifted from mult VI 3-min VI 3-min
to mult VI 3-min VI 72-sec. Responding in the un-
changed component is shown in Figure 11. Data
points below the dashed horizontal line indicate nega- 
tive contrast. 


Explanations of Multiple-Schedule 
Interactions 


Contrast has received far more theoretical attention 
than induction in the literature. The reason for this 
asymmetry is probably historical. Induction effects 
follow from classical Hull-Spence discrimination 
theory (Spence, 1936), while contrast explicitly contra- 
dicts it (Allen, Capehart, & Hebert, 1969). In that 
context induction is the normal, expected effect and 
contrast is the surprise (cf. Bloomfield, 1969).

From the outset, explanations of contrast have in-
cluded some notion of inhibition (Pavlov, 1927,
though he labeled what is now called contrast “induc-
tion”). Almost all of the major current alternative
explanations of contrast on multiple schedules include 
an inhibition component (Bloomfield, 1969; Catania, 
1969; Malone & Staddon, 1973; Staddon, 1969; Ter- 
race, 1966a, 1966b, 1968, 1972). The point of conten- 
tion among the different accounts is the source of 
inhibition, not its existence. In this section we shall 
outline the different accounts of contrast and review 
the evidence which makes a distinction among them 
possible. Our treatment will roughly parallel recent 
reviews of the problem (Freeman, 1971b; Terrace,
1972).


THE REINFORCEMENT
FREQUENCY ACCOUNT


Reynolds’s pioneering research on contrast (1961a,
1961b, 1961c, 1961d) suggested that the variable re- 
sponsible for contrast in the unaltered component of 
a multiple schedule was a change in reinforcement 
frequency in the other component. Contrast was a 
change in response rate in one component in the 
opposite direction from a change in reinforcement 
frequency in the other component. Thus increases in 
reinforcement in component B of a multiple schedule 
result in negative contrast in component A, decreases
in reinforcement in component B result in positive
contrast in A, and changes in B which do not in-
fluence reinforcement frequency will not result in
changes in responding in A. The role of inhibition
in this account of contrast has been most explicitly
and generally stated by Catania (1969). Reinforce-
ment of one class of behavior is argued to have an
inhibitory effect on all other classes of behavior. Thus
reinforcement in one component of a multiple sched-
ule will inhibit responding in the other component,
and the procedural change from mult VI VI to mult
VI EXT results in an increase in VI responding be-


THE RESPONSE SUPPRESSION ACCOUNT


Terrace (1963a, 1963b, 1966, 1978) observed that
when a discrimination is learned without errors (i.e.,
when a simple VI schedule is gradually transformed
into a mult VI EXT in such a way that virtually no
responses are made to the EXT stimulus), positive
contrast is not observed in the VI component. This
observation led Terrace to propose that response rate
reduction, not reinforcement frequency reduction, is
responsible for contrast. Note that in the errorless pro-
cedure, response rate reduction does not occur, since
errors are never made. Note also that in the most com-
mon contrast-inducing procedure, shift from mult VI
VI to mult VI EXT, response rate reduction and rein-
forcement rate reduction are perfectly confounded, and
the two explanations are not separable. In more recent
accounts Terrace has refined his view of contrast. Re-
sponse reduction is necessary but not sufficient to
produce contrast. Responding must be actively sup-
pressed, i.e., it must be inhibited. It is the inhibition
of responding in one component which produces con-
trast in the other component. This view is much like
Amsel’s account of contrastlike effects in very different
experimental situations (1962). The inhibition of re-
sponding produces an emotional by-product, which in
turn energizes responding in neighboring situations.
Thus in Terrace’s view any manipulation which sup- 
presses responding and is demonstrably aversive will 
result in contrast in the adjacent component of a 
multiple schedule. The response suppression pro- 
cedure may, but need not, involve reinforcement re- 
duction. Similarly, reinforcement reduction may not 
involve response suppression. Thus reinforcement re- 
duction is neither necessary nor sufficient to produce 
contrast. 

A related account has been proposed by Bloomfield 
(1969). He argued that any procedure change which 
“worsens” conditions will result in contrast, and that 
any procedure which does not worsen conditions will 
not result in contrast. The major difference between 
Bloomfield’s view and Terrace’s is that Bloomfield
looks to the antecedent conditions which produce re-
sponse suppression rather than to the suppression it-
self to explain contrast. This has the logical advantage
of substituting a causal relation for Terrace’s correla- 
tional one. However, in terms of specific application, 
the two accounts are not very different. 


EXPERIMENTAL SEPARATION OF
REINFORCEMENT REDUCTION AND
RESPONSE REDUCTION


Experimental attempts to separate the two theories 
of contrast fall into two main classes. One class in-
volves the reduction of reinforcement frequency with-
out concomitant reduction in responding. It includes
Terrace’s research on errorless discrimination learning
(1963a, 1963b, 1966a, 1966b, 1972). Terrace demon-
strated that contrast does not occur after errorless dis-
crimination training. Since errorless discrimination
involves a reduction in reinforcement without a con-
comitant reduction in responding, Terrace takes this
as strong support for his account of contrast. On the
other hand, Halliday and Boakes (1974) have observed
contrast in procedures in which response rate was not
reduced (see Rilling, Chapter 15 in this volume, for
additional contradictory evidence). The second and
more diverse class of studies attempts to reduce rate of
responding without reducing rate of reinforcement. In
these procedures Terrace would predict contrast while
Reynolds would not. The results of these studies are 
mixed and open to methodological criticism (Free- 
man, 1971b). 
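
The logic of these separation experiments can be summarized in a short Python sketch. Each account is reduced here to a single yes-or-no rule, which is a deliberate simplification, and the function names and procedure labels are hypothetical. The sketch encodes only what each account predicts, not what the experiments actually found.

```python
def reinforcement_account_predicts(reinforcement_reduced, responding_suppressed):
    """Reinforcement frequency account: positive contrast follows a
    reduction in reinforcement frequency in the other component."""
    return reinforcement_reduced


def suppression_account_predicts(reinforcement_reduced, responding_suppressed):
    """Response suppression account: positive contrast follows active
    suppression of responding in the other component."""
    return responding_suppressed


# (procedure, reinforcement reduced?, responding suppressed?)
procedures = [
    ("mult VI VI to mult VI EXT", True, True),
    ("errorless discrimination training", True, False),
    ("mult VI VI to mult VI VI+punishment", False, True),
]


def diagnostic_procedures(rows):
    """Return the procedures on which the two accounts disagree."""
    return [
        name
        for name, reduced, suppressed in rows
        if reinforcement_account_predicts(reduced, suppressed)
        != suppression_account_predicts(reduced, suppressed)
    ]


print(diagnostic_procedures(procedures))
```

Only the procedures in which the two variables are unconfounded appear in the output, which is why the standard shift to EXT cannot by itself decide between the accounts.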

One class of studies which manipulates response 
rate without changing reinforcement rate involves 
switching from mult VI VI to mult VI VT (e.g., Halli- 
day & Boakes, 1972). Such procedures fail to produce 
contrast, which seems to support the reinforcement 
reduction account. However, Terrace’s recent (1972)
view is that response suppression, produced by nonre-
inforcement or by some aversive stimulation contin-
gent on responding, and not merely response reduction
is the necessary condition for contrast. Response-inde- 
pendent reinforcement, while it reduces responding, 
presumably does not suppress it. Hence this class of 
studies is not decisive. 

Another class of studies reduces responding with 
punishment by electric shock. The punishment param- 
eters are chosen so as to substantially reduce re- 
sponding, but nevertheless to maintain it at a high 
enough rate so that reinforcement frequency is un-
affected. Change from mult VI VI to mult VI VI+
punishment results in positive behavioral contrast
(Brethower & Reynolds, 1962; Terrace, 1968), offering
clear support for Terrace’s view.

There is finally a class of studies which reduces re- 
sponse rate by manipulating the schedule in effect 
during one of the multiple-schedule components. 
Mult VI VI is switched to mult VI DRL (differential 
reinforcement of low rates) or mult VI DRO (differen- 
tial reinforcement of other behavior). Terrace (1968) 
and Weisman (1969) have shown that as response rate 
is reduced in DRL, contrast occurs in the other com-
ponent. More recently, however, Boakes, Halliday,
and Mole (1974) failed to observe contrast in a similar
experiment. The Boakes et al. study controlled very
carefully for differences in local patterns of reinforce-
ment between the VI and DRL components, which
Terrace’s study did not. In the mult VI DRO pro- 
cedures, as response rate is reduced in DRO, contrast 
is usually not observed (Boakes, Halliday & Mole, 
1974; Nevin, 1968; Nevin & Shettleworth, 1966; 
Reynolds, 1961a; but see Weisman, 1970, for a demon-
stration of contrast under these conditions). Thus the
data from these studies are not decisively in support 
of one or the other account of contrast. 

In summary, we have discussed the major views of 
behavioral contrast in this section and have seen that 
experimental tests yield contradictory results. We 
shall see below that none of these accounts is suffi- 
cient. 


Temporal Properties of Contrast 


Our definitions and examples of contrast so far 
have dealt with session-to-session changes in rate of 
responding in the constant component. These com-
parisons are typically made in terms of overall rate of
responding (total number of responses in a component
divided by the total session time in which they may be
emitted). There is, however, substantial evidence that
the pattern of responding within a multiple-schedule
component is not constant. Response rate changes are 
often most dramatic at the beginning of a component. 
These changes, which are restricted to only a portion 
of a component, will be referred to as local contrast 
(Malone & Staddon, 1973). They have also been re- 
ferred to in the literature as transient contrast (Nevin 
& Shettleworth, 1966). However, we wish to reserve the 
latter term to describe a different phenomenon, i.e., 
that sometimes contrast dissipates with extended ex- 
posure to a procedure. 
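
The difference between an overall rate and the local pattern it can conceal is easily illustrated. The Python sketch below is a minimal example under stated assumptions: response times are measured in seconds from the onset of a single component presentation, and the function names, the 20-sec bin width, and the response times are all invented for illustration.

```python
def overall_rate(response_times, component_duration):
    """Responses per minute over the whole component (times in seconds)."""
    return 60.0 * len(response_times) / component_duration


def binned_rates(response_times, component_duration, bin_width):
    """Responses per minute in successive bins of `bin_width` seconds."""
    n_bins = int(component_duration // bin_width)
    counts = [0] * n_bins
    for t in response_times:
        index = int(t // bin_width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return [60.0 * c / bin_width for c in counts]


# Hypothetical 2-min component: a burst of responding in the first 20 sec,
# followed by slow, steady responding for the rest of the component.
times = [1, 2, 3, 4, 5, 6, 7, 8, 25, 45, 65, 85, 105]
print(overall_rate(times, 120))       # 6.5 responses/min
print(binned_rates(times, 120, 20))   # [24.0, 3.0, 3.0, 3.0, 3.0, 3.0]
```

In this invented record the overall rate gives no hint that responding in the first bin is eight times the rate maintained thereafter.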

To be consistent with the definition of contrast 
that we have already given, local contrast must be de- 
fined relative to the adjacent schedule. Indeed, with- 
out reference to the prior component, any time an FI 
(fixed-interval) or FR schedule is used one would have 
to call the resulting behavior an example of local 
contrast, since the pattern of responding maintained 
on those schedules is not constant. Consequently, an 
initial elevation followed by a lower constant response 
rate in a given component (A) will be defined as local
positive contrast if the overall response rate in an
immediately prior component (B) is lower than the
overall response rate in component A. If the overall
response rate in component B is greater than the over-
all response rate in component A, then the local effect
is defined as local positive induction. Conversely, an
initial depression in response rate at the beginning of
a component is defined as either local negative con-
trast or local negative induction, depending on the
response rate in the prior component. 
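The classification just defined can be summarized in a short sketch. The function name, argument names, and string labels are illustrative assumptions, not part of the original account. The text spells out only the two "elevation" cases; the mapping for the two "depression" cases is assumed here to be their mirror image, and the case of equal overall rates is left unclassified.

```python
def classify_local_effect(initial_change, prior_rate, current_rate):
    """Label a local effect at the start of component A.

    initial_change: "elevation" or "depression", the direction of the
        initial change in response rate relative to the later steady rate.
    prior_rate: overall response rate in the immediately prior component B.
    current_rate: overall response rate in component A.
    """
    if initial_change not in ("elevation", "depression"):
        raise ValueError("initial_change must be 'elevation' or 'depression'")
    if prior_rate == current_rate:
        # The text does not define this case.
        return "unclassified"
    prior_is_lower = prior_rate < current_rate
    if initial_change == "elevation":
        # These two cases are stated explicitly in the text.
        if prior_is_lower:
            return "local positive contrast"
        return "local positive induction"
    # Assumed mirror image of the elevation cases.
    if prior_is_lower:
        return "local negative induction"
    return "local negative contrast"
```

For instance, an initial elevation in a VI component that follows EXT (where the overall rate is near zero) would be labeled local positive contrast under this rule.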

Boneau and Axelrod (1962) studied pigeons on a VI 
1-min schedule for six days and then changed the
procedure to a mult VI 1-min EXT. Each component
lasted 60 sec. As might be expected, overall contrast 
was found. Responding in the presence of the VI 
stimulus more than doubled after the introduction of 
an extinction component. Local positive contrast was 
found in the next stage, in which instead of alternat- 
ing between VI and EXT components, Boneau and 
Axelrod introduced either the EXT stimulus or a 
time-out (TO) only every ninth component. Thus the 
VI component lasted 8 min. Response rate in the first 
minute was substantially higher than in subsequent 
1-min blocks and showed a gradual decrease until the
block immediately preceding the EXT or TO. How- 
ever, after four sessions of this procedure, these local 
contrast effects were no longer evident. Thus Boneau 
and Axelrod had reported the first instance of tran- 
sient local contrast. It should also be noted that while 
the overall rates of responding in the VI component 
showed a slight decrease during the four days of testing,
the basic overall contrast effect remained even
though the local effect had disappeared. Catania and
Gill (1964) observed local positive contrast on a mult
FI EXT schedule. However, unlike the Boneau and
Axelrod data, these effects persisted for 5? successive
daily sessions. Similarly, Arnett (1973) reported local
contrast effects in a mult VI 3-min EXT schedule for
up to 65 sessions.

Fig. 12. Performance of one bird on a 3-ply multiple schedule
in which a VI 8-min component alternated with either a VI
2-min or an EXT schedule. Response rate in the VI 8-min (red)
is broken down into six successive 20-sec periods and is averaged
over three sessions (10-12). The depression and elevation in the
first 20 sec of VI 8-min are examples of negative and positive
local contrast effects, respectively. (Adapted from Nevin &
Shettleworth, 1966. © 1966 by the Society for the Experimental
Analysis of Behavior, Inc.)

Nevin and Shettleworth (1966) studied a procedure 
in which 2-min components of VI 8-min reinforcement
in red were preceded by either a VI 2-min component
(green) or an EXT component (white). Figure 12 presents
response rates in successive 20-sec segments of the
VI 8-min component. Local negative contrast was observed
when this component followed VI 2-min, and
local positive contrast was observed when this component
followed EXT.

There has not been a great deal of research on local 
contrast. It is, therefore, difficult to enumerate with
confidence the conditions necessary to produce it.
Some generalizations can be stated briefly, however.
First, there is good evidence that local contrast effects
increase with increases in the duration of the changed 
component (Staddon, 1969; Wilton & Clements, 
1971). These effects parallel the results of studies of 
trial and ITI duration in autoshaping procedures 
(e.g., Baldock, 1974; see p. 77, above), if the un- 
changed component is viewed as a “trial” against the 
background of the changed component. These studies
found that the rate of responding on autoshaping
procedures increased with increases in the ITI.
Second, there is evidence that pigeons and rats differ
in showing local contrast, at least local positive contrast.
In pigeons both positive and negative local
contrast are reliably observed. In rats positive local 
contrast is typically not observed, though negative 
local contrast is (Bernheim & Williams, 1967; Wil- 
liams, 1965). Third, the necessary prerequisite for 
local contrast appears to be a difference in reinforce- 
ment frequency, not response frequency, in adjacent 
components (Freeman, 1971a; Nevin & Shettleworth,
1966; Williams, 1965). Finally, although the condi- 
tions from which local contrast arises are often iden- 
tical to the conditions which produce overall contrast, 
it is clear that the latter is not made up merely of the 
former. Overall positive contrast has been observed 
after local contrast has ceased (Boneau & Axelrod,
1962), and local contrast has been observed in a situation
which produced overall induction (Freeman,
1971a).


Interactions in Concurrent Schedules:
Relations Between Multiple
and Concurrent Schedules


Concurrent schedules are two or more schedules 
which are simultaneously in effect, each associated 
with a different response. The prototypic concurrent 
procedure has involved the study of pigeons pecking 
one of two simultaneously illuminated response keys,
with pecks on each key associated with a different
schedule (e.g., Catania, 1966). An alternative pro- 
cedure involves two keys. Pecks on the operant key 
produce reinforcement, while pecks on the changeover 
key change the schedule (and the correlated stimulus) 
on the operant key (Findley, 1958). A great deal of 
research, especially on the quantitative aspects of 
schedule control, has been done with concurrent
schedules, and is discussed in detail by de Villiers
(Chapter 9 of this volume), Catania (1966), and
Herrnstein (1970). Moreover, Herrnstein (1970) has
drawn attention to the parallels between phenomena 
observed on multiple and concurrent schedules and 
suggested that performance on both types of schedules 
reduces to a common explanatory principle. It seems 
appropriate, therefore, to discuss the relations be- 
tween these schedules briefly. 

The findings obtained in studies of concurrent 
schedules can be summarized by the word matching. 
Relative rate of responding in component A (rate of
responding in component A divided by the sum of the rates
in components A and B) matches relative rate of
reinforcement in that component (Catania, 1966; Herrnstein,
1970). This is the epitome of schedule interaction.
Responding in one component varies inversely
with reinforcement frequency in the other component.
If a concurrent VI VI (conc VI VI) procedure is
shifted to a conc VI EXT, an increase in responding 
in the unchanged VI component occurs (Catania, 
1969). Similarly, if a conc VI EXT procedure is 
switched to conc VI VI, a decrease in responding in 
the unchanged VI component occurs (Catania, 1969). 
These outcomes are analogous to demonstrations of 
positive and negative behavioral contrast in multiple 
schedules. Indeed, the matching law is a general state- 
ment about schedule interactions which subsumes 
positive and negative contrast. 

The foregoing is not meant to suggest, however, 
that multiple and concurrent schedules always yield 
identical results. First, matching does not usually 
occur on multiple schedules. Rather than allocating 
responses in the two components in direct proportion 
to reinforcements in the two components, organisms 
tend to undermatch (Reynolds, 1961b). The equation
relating relative response rate to relative reinforce- 
ment rate has a slope less than 1.0 and a positive inter- 
cept. 

In a recent quantitative analysis, Herrnstein (1970) 
attempted to capture both the similarities and differences
between multiple and concurrent schedules. The
quantitative relation he proposed was 


$$P_A = \frac{k\,r_A}{r_A + m\,r_B + r_0} \tag{1}$$

where $P_A$ = rate of responding in component A; $r_A$ =
rate of reinforcement in component A; $r_B$ = rate of
reinforcement in component B; $r_0$ = rate of reinforcement
for responses other than A and B (e.g., grooming);
and m = a constant representing the degree of
interaction between components. In concurrent procedures,
m = 1; i.e., interaction is maximal. Matching
is reflected by the following equation, with m = 1:


$$\frac{P_A}{P_A + P_B} = \frac{r_A}{r_A + r_B}$$


In multiple schedules, m approaches 1.0 as component 
duration decreases. 
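A small numerical sketch may make the role of m concrete. It assumes that the equation given for the rate of responding in component A applies symmetrically to component B (the text states it only for A), and the reinforcement rates, the value m = 0.2, and the scaling constant k = 1 are hypothetical values chosen for illustration.

```python
def response_rate(r_own, r_other, m, r0, k=1.0):
    """Herrnstein's relation: P = k * r_own / (r_own + m * r_other + r0)."""
    return k * r_own / (r_own + m * r_other + r0)


def relative_response_rate(r_a, r_b, m, r0, k=1.0):
    """P_A / (P_A + P_B), applying the same relation to each component.

    Using the relation for component B as well is an assumption of symmetry.
    """
    p_a = response_rate(r_a, r_b, m, r0, k)
    p_b = response_rate(r_b, r_a, m, r0, k)
    return p_a / (p_a + p_b)


# Hypothetical reinforcement rates (reinforcers per hour).
r_a, r_b, r0 = 60.0, 20.0, 10.0

matching = r_a / (r_a + r_b)  # relative reinforcement rate
concurrent = relative_response_rate(r_a, r_b, m=1.0, r0=r0)  # maximal interaction
multiple = relative_response_rate(r_a, r_b, m=0.2, r0=r0)    # weaker interaction

# Contrast: removing reinforcement from component B raises P_A when m > 0.
p_a_before = response_rate(r_a, r_b, m=0.2, r0=r0)
p_a_after = response_rate(r_a, 0.0, m=0.2, r0=r0)
```

With m = 1 the two denominators are identical, so the relative response rate reduces to the relative reinforcement rate (matching). With m below 1 the relative response rate falls between 0.5 and the matching value, which corresponds to the undermatching described for multiple schedules.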

As is apparent from Equation 1, any alternative reinforcement
for responses other than $P_A$ will decrease $P_A$
(contrast). Also apparent is the prediction that as component
duration in multiple schedules is shortened, approximations
to matching should get closer and closer,
since m grows larger and larger. There is substantial 
empirical support for this prediction. Shimp and 
Wheatley (1971) and Todorov (1972) have shown that 
with extremely short (i.e., about 10-sec) component
durations, matching is obtained on multiple schedules.
Indeed, one could view a concurrent procedure as a 
special case of a multiple procedure, where the sub- 
ject rather than the experimenter controls component 
duration. On concurrent procedures, subjects tend to 
produce short component durations (i.e., change keys 
at a high rate). Hence one observes matching with 
concurrent procedures. Killeen (1972) examined the 
possibility that multiple and concurrent procedures 
are intimately related by exposing pigeons to a con- 
current procedure and arranging for switches from 
one component to the other to produce component 
changes for other pigeons that were responding on 
multiple schedules identical in reinforcement frequency
to the concurrent schedules. This enabled
Killeen to compare the control over responding 
exerted by concurrent schedules and multiple sched- 
ules which were identical in both reinforcement fre- 
quency and component duration. Killeen found that 
mult and conc pigeons allocated their responses to the 
different components identically, thus strengthening 
the view that multiple and concurrent schedules are 
not fundamentally different. However, the generality 
of Killeen’s findings has recently been questioned in 
an experiment by Silberberg and Schrot (1974). They 
pointed out that in Killeen’s study, since the
concurrent-schedule pigeons switched frequently between
components, the resulting multiple schedule had short 
component durations. Both Herrnstein’s theoretical 
account and the data obtained by Shimp and Wheat- 
ley (1971) and Todorov (1972) suggest that multiple 
schedules and concurrent schedules have similar 
effects only when the multiple-schedule components are short.
Thus Silberberg and Schrot asked whether the simi- 
larity between mult and conc performances on a Kil- 
leen-type procedure would persist even when the conc 
pigeons created long mult components. They accomplished
this by introducing long changeover delays
(CODs). A COD is a contingency which prevents rein- 
forcement of a response on one key until some amount 
of time has elapsed since the last response on the other 
key. Long CODs have been shown to decrease the like- 
lihood of switching between components in concur- 
rent procedures (Shull & Pliskoff, 1971). Thus they 
should result in lengthened components for the 
pigeons on mult procedures. Silberberg and Schrot 
found that as COD increased, thus increasing com- 
ponent duration in both conc and mult procedures, 
differences in response allocation between conc and 
mult subjects increased. Thus they argued that since
matching on concurrent procedures is independent of
component duration, while matching on multiple procedures
depends upon component duration, there are
fundamental differences in the control over behavior
exerted by the two types of procedure. 

There are other reasons for believing this to be the 
case. Despite the gross similarity between responding 
maintained by multiple and concurrent schedules, 
molecular analysis of how matching occurs on concur- 
rent schedules makes it clear that matching on con- 
current and multiple schedules is different. Rachlin 
(1973) has discussed this in detail. On concurrent 
schedules, animals distribute the time spent responding
in the two components in proportion to the relative
reinforcement rate in those components (Baum &
Rachlin, 1969). For example, if reinforcements in
component A are twice as frequent as reinforcements
in component B, animals will spend twice as much
time responding in component A and make twice as
many responses. What this means, however, is that
local response rate (responses divided by the time
available in which to make them) will be equal in the
two components. This cannot be true of multiple
schedules. Since component duration in multiple
schedules is fixed, increases in responding in component
A as a function of decreases in reinforcement
in component B must result from increases in local
response rate and not from increases in time allocation.
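The arithmetic behind this argument can be illustrated with a minimal sketch. All session lengths and response counts below are hypothetical; only the definition of local response rate (responses divided by the time available) comes from the text.

```python
def local_rate(responses, seconds):
    """Local response rate: responses divided by the time available."""
    return responses / seconds


# Concurrent schedule (hypothetical 600-s session): reinforcement is twice
# as frequent in A as in B, so time and responses are both split 2:1.
conc_time_a, conc_time_b = 400.0, 200.0
conc_resp_a, conc_resp_b = 800, 400
conc_local_a = local_rate(conc_resp_a, conc_time_a)
conc_local_b = local_rate(conc_resp_b, conc_time_b)

# Multiple schedule: component A always lasts 300 s (fixed by the
# experimenter). If responses in A rise after reinforcement in B is
# reduced, the extra responses can only come from a higher local rate.
mult_time_a = 300.0
mult_resp_a_before, mult_resp_a_after = 300, 450
mult_local_before = local_rate(mult_resp_a_before, mult_time_a)
mult_local_after = local_rate(mult_resp_a_after, mult_time_a)
```

In the concurrent case the two local rates come out equal even though component A receives twice the responses; in the multiple case the time base cannot change, so any increase in responses is an increase in local rate.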

What might be the source of this increase in local 
response rate which occurs in multiple schedules? We
shall now describe a new theory of behavioral contrast
which attributes these extra responses to autoshaping-like
stimulus-reinforcer relations which are sometimes
present in multiple schedules.


An Additivity Theory of Contrast


The phenomena of autoshaping and automaintenance
have been thoroughly discussed above. The
available evidence clearly indicates that autoshaping
reflects the control of key pecking by Pavlovian,
stimulus-reinforcer contingencies. One series of experiments,
by Gamzu and Williams (1971, 1973), makes
the parallel between autoshaping and Pavlovian conditioning
particularly clear and is especially relevant
to the theory of contrast we shall propose. Pavlovian
conditioning depends upon the existence of an informative
or differential relation between the CS and
the US. Pairing is neither necessary nor sufficient
(Rescorla, 1967). Gamzu and Williams applied this
analysis to autoshaping by showing that unless the response
key was a differential predictor of food
[P(food/key) > P(food/no key)], autoshaping would not
occur and already established key pecking would
cease. The question of primary relevance to the
phenomenon of contrast raised by the Gamzu and
Williams experiments and other studies of autoshap- 
ing and automaintenance is this: given that stimulus- 
reinforcer relations control key pecking in the absence 
of response-reinforcer relations, what is their effect 
when response-reinforcer dependencies are also present?
To examine this question, let us return to the
standard sequence of multiple schedules employed to 
demonstrate behavioral contrast. One begins typically 
with mult VI VI. In this procedure response-reinforcer
dependencies exist in both components. However, 
there is no differential stimulus-reinforcer relation. 
Food is equally likely in both components of the 
multiple schedule. When the procedure is changed to 
mult VI EXT, the response-reinforcer dependency
continues in the VI component. Now, however, a
differential stimulus-reinforcer relation is also introduced.
The VI-correlated stimulus predicts food while
the EXT-correlated stimulus does not. This state of
affairs is what generated and maintained pecking in 
the Gamzu and Williams experiments. Thus we might 
expect two sources of control of pecking to be oper- 
ative in the VI component of a mult VI EXT schedule 
(both stimulus-reinforcer and response-reinforcer rela- 
tions), while only one was operative in the preceding 
mult VI VI schedule. By assuming, for simplicity, that 
the two sources of control interact additively, one 
would expect pecking to increase in the VI component 
of a mult VI EXT schedule relative to its rate of 
occurrence on a mult VI VI. This, of course, defines 
positive contrast. Thus the additivity theory of contrast
is simple: contrast occurs because a differential
stimulus-reinforcer dependency is imposed upon an
already existing response-reinforcer dependency, and
the two sources of control combine to increase the
rate of key pecking. A number of investigators arrived
at roughly this conclusion simultaneously (Boakes,
1975; Gamzu & Schwartz, 1973; Hemmes, 1973; Rachlin,
1973; Staddon, 1972). The theory was first articulated
in its present form by Gamzu and Schwartz
(1973) on the basis of an experiment which extended
the findings of Gamzu and Williams (1971, 1973) to
procedures employing parameters akin to those employed
on standard multiple schedules. Pigeons were
exposed to a multiple schedule with regularly alternating
components signaled by key color. The components
were 27 sec long, and reinforcements were
always delivered independently of responses. The procedures
were either differential (mult VT 33-sec EXT)
or nondifferential (mult VT 33-sec VT 33-sec). As
Gamzu and Williams found with a discrete-trials
procedure, pecking was generated and maintained on
the differential procedure and essentially eliminated
on the nondifferential procedure. These data are summarized
in Figure 13. Gamzu and Schwartz argued
that it was these responses which summed with those 
maintained by response-reinforcer dependencies in 
standard multiple schedules to produce contrast. 
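The logic of the additivity account can be sketched in a few lines. This is only an illustration under the simplifying assumption, made in the text, that the two sources of control sum; the function names, the counts of food deliveries, and the response rates are hypothetical and are not taken from any of the experiments cited.

```python
def food_probabilities(food_in_stimulus, periods_in_stimulus,
                       food_elsewhere, periods_elsewhere):
    """Return (P(food | stimulus), P(food | stimulus absent))."""
    p_stimulus = food_in_stimulus / periods_in_stimulus
    p_elsewhere = food_elsewhere / periods_elsewhere
    return p_stimulus, p_elsewhere


def is_differential_predictor(p_stimulus, p_elsewhere):
    """A stimulus predicts food differentially if food is more likely in it."""
    return p_stimulus > p_elsewhere


def predicted_peck_rate(operant_rate, stimulus_rate, differential):
    """Additive combination of the two sources of control over pecking.

    operant_rate: pecks/min maintained by the response-reinforcer dependency.
    stimulus_rate: pecks/min generated by a differential stimulus-reinforcer
        relation; it contributes only when such a relation exists.
    """
    return operant_rate + (stimulus_rate if differential else 0.0)


# mult VI VI: food is equally likely in both components (hypothetical counts).
vi_vi = food_probabilities(30, 100, 30, 100)
# mult VI EXT: food occurs only in the VI component.
vi_ext = food_probabilities(30, 100, 0, 100)

rate_vi_vi = predicted_peck_rate(50.0, 20.0, is_differential_predictor(*vi_vi))
rate_vi_ext = predicted_peck_rate(50.0, 20.0, is_differential_predictor(*vi_ext))
```

Under these assumed numbers the predicted rate in the unchanged VI component is higher on mult VI EXT than on mult VI VI, which is the pattern called positive contrast.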


PREDICTIONS BASED UPON THE
ADDITIVITY THEORY OF CONTRAST
AND THEIR CONFIRMATION


The central component of the additivity theory, 
stated generally, is that whenever a differential stim- 
ulus-reinforcer relation exists, that stimulus will exert 
control over some class of behavior. The obvious problem
which arises is the specification of the class of behavior
which will be controlled. There are at least
two views on this matter. One view suggests that the 
class of behavior which will be controlled by stimulus- 
reinforcer relations is just that class which is appro- 
priate to the reinforcer—i.e., a class of consummatory 
responses—and that these responses will be directed at 
the signaling stimulus. This view is in keeping with 
our traditional understanding of Pavlovian condition- 
ing, which is that a conditional stimulus comes to 
elicit some component(s) of the unconditional re- 
sponse to the US. The evidence in support of this 
view of autoshaping is good, as discussed earlier (cf. 
Moore, 1973). In the pigeon, pecking is a consum- 
matory response (Jenkins & Moore, 1973; Staddon & 
Simmelhag, 1971), and the form of autoshaped responding
is appropriate to the reinforcer: food or
water pecks with food or water reinforcers (Jenkins &
Moore, 1973). When rats are autoshaped to contact
rather than to press a lever, they contact the lever by
chewing it (Peterson, Ackil, Frommer, & Hearst, 1972;
Stiers & Silberberg, 1974). A second view, which has
been labeled “sign tracking” (Hearst & Jenkins,
1974), includes the directedness of the previous view,
but suggests that within limits, organisms will direct 
whatever skeletal activity is possible toward a stimulus 
which signals food. The form of the response need 
not, though it might, bear any clear resemblance to 
the unconditional response to food. It is not clear 
which of these views is correct, but they make the 
same predictions about contrast in standard pigeon 
and rat experiments. In the standard multiple-sched- 
ule procedure employed with pigeons, pecking is the 
measured response (consummatory), and the discrim- 
inative stimuli are located on the response key (sign 
tracking). Thus either view of autoshaping would 
predict that stimulus-reinforcer and response-reinforcer
relations will both influence the same behavior,
key pecking. (The difference between the two views
is in whether one underlines “key” or “pecking” in
“key pecking.”)

Now consider the standard multiple-schedule procedure
employed with rats. Rats press a lever (nonconsummatory)
for food. The discriminative stimuli
are located away from the lever (no sign tracking).
Thus either view would predict that when rats are exposed
to mult VI EXT after mult VI VI (i.e., when a
differential stimulus-reinforcer relation is introduced), 
bar pressing will not be enhanced. Indeed, to the ex- 
tent that the stimulus-reinforcer relation is effective, 
it will generate some other consummatory behavior, 
directed at the signal, which may compete with bar 
pressing. This competition would result in negative
induction of bar pressing (reduction in VI rate when
the second component is changed from VI to EXT)
rather than contrast. Thus the additivity theory of
contrast would predict that contrast will not be obtained
with bar-pressing rats as subjects. There are no
features of other theories of contrast already discussed
which would lead one to this prediction. The literature
suggests that, indeed, with bar-pressing rats,
induction rather than contrast is the rule (e.g., Freeman,
1971b). To summarize, contrast is not expected simply
because response-reinforcer and stimulus-reinforcer
relations do not influence the same class of behavior.
If the discriminative stimuli were located on the lever,
then contrast might occur. The stimulus-reinforcer
contingency would generate some form of lever contact,
perhaps biting, which would sum with the lever
presses already being maintained by the response-reinforcer
dependency.

The implication of this theory of contrast is that 
the bulk of demonstrations of positive contrast are the 
result of a fortuitous procedural convention which (a) 
measures as operants consummatory responses and
(b) localizes discriminative stimuli on the response
key. The theory predicts that if either of these procedural
features is altered, contrast will not appear.
Again, other theories make no such prediction. A 
number of experiments have recently been conducted 
which put this aspect of the theory to empirical test.
Westbrook (1973) and Hemmes (1973) exposed
pigeons to standard multiple schedules except that
the required operant was treadle hopping in
Hemmes’ experiment and bar pressing in Westbrook’s.
Under these conditions, we would expect the pigeons
to behave like rats. The differential stimulus-reinforcer
relation established during mult VI EXT would
not enhance treadle hopping or bar pressing, but
would instead enhance some other behavior directed
at the signaling stimulus. Thus probably negative
induction, but certainly not positive contrast, should
occur. Both studies failed
to find contrast, and Westbrook’s study demonstrated
large negative induction effects. Some data from Westbrook’s
study are presented in Figure 14. An especially
interesting feature of the Westbrook study is that
generalization tests around the stimulus associated
with EXT were performed, and inhibitory generalization
gradients were obtained. Thus Westbrook’s study
shows that inhibitory control can be established in
such an experimental situation, but that it is not
sufficient to produce behavioral contrast.

Fig. 14. Bar presses per minute for two pigeons over the last
7 sessions of mult VI 1-min VI 1-min and 14 sessions of mult
VI 1-min EXT (lefthand panels) with results of a generalization
test around the stimulus associated with EXT (S2) in the
righthand panels. Both negative induction and inhibitory
control are demonstrated. (From Westbrook, 1973. © 1973 by the

A second class of relevant studies employs key peck- 
ing as the operant but removes the discriminative 
stimuli from the response key. Again, the additivity 
theory would predict that the differential stimulus- 
reinforcer contingency would control behavior, but 
that since the stimulus was off the key the behavior 
would not sum with operant pecks, and hence no con- 
trast would result. Experiments by Redford and 
Perkins (1974) and Schwartz (1974a, 1974c, 1975) 
confirm this expectation. In Schwartz’s experiments,
pigeons were exposed to a series of multiple schedules, 
both mult VI VI and mult VI EXT. What varied 
from one set of schedules to the next was the location 
and/or modality of the discriminative stimuli. Figure 
15 presents, for a representative pigeon, rates of responding
in both components of the series of multiple
schedules. Each exposure to the mult VI EXT sched- 
ule was characterized by a different set of discrimina- 
tive stimuli, identified in the figure. Contrast was only 
observed when the signals for the multiple-schedule 
components were on the response key. The data presented
in the figure were characteristic of all subjects,
except that some pigeons evidenced a small contrast
effect with tone off as S+. This latter effect has also
been reported by Hemmes (1973) and Westbrook 
(1973) and may require additional explanatory con- 
cepts. 

We shall conclude this section by describing a 
study by Keller (1974) which provides most elegant support
for the additivity theory of contrast. Keller exposed 
pigeons to a standard sequence of multiple schedules 
and obtained positive contrast. He then spatially sep- 
arated response-reinforcer and stimulus-reinforcer re- 
lations. The old response key was always illuminated 
by the same stimulus, and reinforcement depended 
upon pecks on this key (operant key). The stimuli 
which signaled components of the multiple schedule 
were now alternated on a second key (signal key). 
During mult VI VI, responding was maintained as 
normal on the operant key, and there was no respond- 
ing on the signal key. During mult VI EXT, response 
rate on the operant key did not change substantially 
during VI. However, most of the pigeons now started 
pecking at the VI stimulus on the signal key. Re- 
sponding on the two keys together, if summed, showed 
behavioral contrast. 

A second study produced more impressive results. 
Pigeons were exposed to three-component multiple 
schedules of reinforcement. Pecks on the operant key 
were required for reinforcement, while the multiple-schedule
components were signaled on a second key.
Initially, the component schedules were VI, VI, and
EXT. Under these conditions, pigeons pecked at both
of the VI stimuli on the signal key. When the procedure
was changed to mult VI EXT EXT, Keller
observed the following: (1) pecks on the signal key at
the EXT stimulus which was previously correlated
with VI were substantially reduced; (2) pecks on the
signal key at the VI stimulus were substantially increased;
(3) pecks at the operant key during VI were
substantially reduced, i.e., negative induction was
observed; (4) despite induction on the operant key, if
operant key pecks and VI signal pecks were summed,
the uniform result was behavioral contrast. Some of
these data are presented in Figure 16.

Fig. 15. Responses per minute for one pigeon exposed to a
repeated cycle of mult VI VI (panels 1, 3, and 5) followed by
mult VI EXT (panels 2, 4, and 6). The location and modality of
the discriminative stimuli were varied as indicated in each panel.
Contrast was only observed when the discriminative stimuli
were on the key. (From Schwartz,


RELATIONS BETWEEN AN ADDITIVITY
THEORY OF CONTRAST AND
EXISTING THEORIES


In the preceding section we outlined a new account 
of behavioral contrast. We shall now consider the 
possible relation between this theory and other 
theories of contrast, which focus on the concept of 
inhibition. Logically, the two accounts are not incompatible.
It is entirely possible that inhibition produced
in some way is a necessary condition for contrast.
The autoshaping theory simply asserts that
inhibition is not sufficient. As the experiment by Westbrook
(1973), discussed in the last section, clearly
demonstrates, one can obtain inhibitory stimulus control
without obtaining contrast. Contrast requires
more than inhibition; i.e., it requires the appropriate
choice of response and discriminative stimulus by the
experimenter. Thus the question of whether inhibition
is sufficient for contrast has already been answered
negatively. The unresolved question is whether
inhibition is necessary. Current controversies regarding
inhibitory theories of contrast focus upon the
response-reinforcer relation. Inhibition produced by
nonreinforced responding results in a rebound increase
in the rate of reinforced responses on one view
(Terrace, 1972). Inhibition of responding in one component
of a multiple schedule by reinforcement of
responding in the other component is eliminated in a
shift from mult VI VI to mult VI EXT, thus increasing
unchanged VI response rate, on the other view
(Catania, 1969; Reynolds, 1961a). The additivity
theory of contrast suggests a different approach to
inhibition, one which focuses on stimulus-reinforcer
relations. It is an excitatory stimulus-reinforcer relation
which produces contrast according to the theory.
However, on a mult VI EXT procedure, the EXT
stimulus resembles procedurally a Pavlovian conditioned
inhibitor (Rescorla, 1969b). Thus the standard
procedural progression from mult VI VI to mult
VI EXT inevitably introduces both excitatory and
inhibitory Pavlovian contingencies.2 The obtained
contrast effect could ostensibly result from either or
both of these contingencies. The mult VI VI procedure
is the operant analog to the Pavlovian truly
random control (Rescorla, 1967). Neither stimulus is
a differential predictor of food, and there should be
no Pavlovian conditioning.

Fig. 16. Responses per minute during the unchanged component
of a series of three-component multiple schedules. Response
rates are segmented into responses to the operant key and
responses to the signal key (shaded). The multiple schedules
are divided into those with two VI components (regardless of
order of presentation) and those with two EXT components.
The data are averages of data presented in Keller (1974, Table
2). In all cases the overall contrast effect can be completely
attributed to increased responding to the signal key.

Support for the possibility that the extinction 
stimulus may be a Pavlovian conditioned inhibitor 
comes from a study by Jenkins and Boakes (1973). 
Pigeons were exposed to three different stimuli on 
response keys. One stimulus reliably signaled food. 
One stimulus reliably signaled no food. Food delivery
was random with respect to the third stimulus. Autoshaped
pecking occurred only to the stimulus which
signaled food. Of interest to the present topic, however,
is that, as measured by their position in the chamber
relative to the stimuli, pigeons preferred the random
stimulus to the stimulus which signaled no food. If 
the relative aversion to the no-food stimulus in the 
Jenkins and Boakes study is taken as an indication of 
Pavlovian conditioned inhibition, and if the no-food 
stimulus is considered the analog of an extinction 
stimulus in a multiple schedule, then the Jenkins and 
Boakes study supports the view that the EXT signal 
in a multiple schedule functions as a Pavlovian con- 
ditioned inhibitor. 

Having suggested that contrast may depend upon 
the joint action of Pavlovian excitation and inhibition 
of the samé résponse, let us considér the following 
experiment. Pigeons are exposed to a mult VI 1-min 
VI I-min schedule, then shifted to a mult VI 1-min 
VI 5-min (Guttman, 1959; Terrace, 1968). Such a 
procedure results in positive behavioral contrast. 
While if is not unreasonable to suggest that such a 


procedure shift introduces an excitatory stimulus-rein- 


2It should be noted that this feature of multiple schedules is 
an inherent feature of all Pavlovian conditioning experiments. 
If a differential positive contingency exists between a CS and a 
US, then a differential negative contingency must exist between 
the absence of the CS and the US. That this relation may some- 
times be crucial is discussed in Seligman’s (1969) criticism of 
Rescorla’s truly random control procedure. This fact is a con- 
sequence of the modern view of Pavlovian conditioning as being 
dependent on more than the mere pairing of stimuli (Rescorla, 
1967). Indeed, Pavlov (1927) expected that in standard experi- 
mental procedures background stimuli would become weak 
conditioned excitatory stimuli by virtue of being sometimes 
paired with a US. On the modern view, background stimuli 
would not be expected to be excitatory, and might even be 
inhibitory. 

forcer relation in the unchanged VI 1-min component, 
can one argue that the change from VI 1-min to VI 
5-min in the other component results in an inhibitory 
stimulus-reinforcer relation? Neither of the two recent 
reviews of research on conditioned inhibition (Hearst, 
Besley, & Farthing, 1970; Rescorla, 1969b) presents any 
relevant data from the Pavlovian conditioning litera- 
ture. One piece of experimental evidence which is 
available suggests that inhibition is not produced by 
such a procedure. In the Gamzu and Schwartz (1973) 
experiment discussed above, one sequence of pro- 
cedures involved a shift from mult VT 33-sec VT 
33-sec to mult VT 33-sec VT 100-sec. Response rate 
increased dramatically in the unchanged VT 33-sec 
component—presumably due to excitatory Pavlovian 
conditioning. However, response rate also increased 
dramatically in the VT 100-sec component relative to 
prior rate when the schedule in that component was 
VT 33-sec. This increase in rate is incompatible with 
the idea that such changes in procedure might pro- 
duce inhibition. On the other hand, a study by Weis- 
man (1969) suggests that the stimulus signaling re- 
duced reinforcement may be inhibitory. Weisman 
shifted pigeons from mult VI 1-min VI 1-min to mult 
VI 1-min VI 5-min and observed positive behavioral 
contrast in the unchanged VI component. Concur- 
rently, he observed gradients of inhibition about the 
stimulus correlated with the VI 5-min schedule. Also, 
a study by Rilling, Askew, Ahlskog, and Kramer 
(1969) has shown that pigeons will peck a key to 
escape from the stimulus correlated with VI 5-min on 
a mult VI 30-sec VI 5-min procedure. Thus at present 
the question of whether Pavlovian inhibition is neces- 
sary for contrast to occur must be left unresolved.3 


3We have been using the term inhibition somewhat loosely 
in this section, and we shall not attempt a rigorous definition of 
the term. Reviews by Rescorla (1969) and Hearst, Besley, and 
Farthing (1970) have dealt extensively with the problem of 
establishing defining criteria for the presence of inhibition. We 
have cited evidence from Jenkins and Boakes (1973) that pigeons 
withdraw from an S−, and from Rilling et al. (1969) that 
pigeons peck a key to escape from a stimulus correlated with a 
relatively low density of reinforcement as instances of inhibitory 
control. Technically, they are not. Rather, they are demonstra- 
tions of the relative aversiveness of these stimuli. The relation 
between these indications of aversiveness and inhibition is an 
open question. For the present purposes, however, the concepts 
of inhibition and aversion are functionally equivalent: they both 
imply a reduction in responding. There is still a problem, how- 
ever. How can an S− reduce responding which is already at 
zero, which is the level of Pavlovian responses we would expect 
to be present in a mult VI VI with equal-density components? A 
possible solution to this problem may arise from an evaluation 
of the stimulus-reinforcer relations which exist in the chamber 
from a broader perspective. While it is true that neither mult 
stimulus is a better predictor of food than the other, it is also 
true that both stimuli are differential predictors of food in 
contrast with the stimuli present outside the chamber. If this 

There is one final issue to be discussed with regard 
to the additivity theory of contrast. How does such a 
theory account for negative contrast? As mentioned 
above, negative contrast refers to a decrease in re- 
sponse rate as response rate in the other component 
increases. In studies of multiple schedules, negative 
contrast is not obtained as reliably, or in as great a 
magnitude, as positive contrast. For example, Terrace 
(1968) shifted pigeons from mult VI 5-min VI 5-min 
to mult VI 1-min VI 5-min and obtained little nega- 
tive contrast, a finding which he interpreted as sup- 
port for his view that contrast results from the frustra- 
tion of nonrewarded responses, or, as Bloomfield has 
described it, from a worsening of conditions. There 
are, however, demonstrations of local negative contrast 
in both rats (Bernheim & Williams, 1967) and pigeons 
(Nevin & Shettleworth, 1966). An additivity theory of 
negative contrast encounters the same problem dis- 
cussed above with regard to conditioned inhibition. 
Multiple VI 5-min VI 5-min schedules presumably re- 
sult in no Pavlovian conditioning. The shift to mult 
VI 5-min VI 1-min must make the VI 5-min stimulus a 
conditioned inhibitor in order to account for negative 
contrast. As we have seen, such an argument is prob- 
lematic. An alternative possibility is that negative and 
positive contrast are causally unrelated—that despite 
their symmetrical appearance, they are produced by 
different variables. Bernheim and Williams (1967) 
demonstrated that positive and negative contrast are 
dissociable: they appear to occur independently of 
one another. Some rats in their study showed one type 
and some the other, and the same rats showed each 
effect at different points in the experiment. One way 
to investigate this possibility more systematically is 
to assess whether manipulation of the variables which 
influence positive contrast (e.g., location of discrimina- 
tive stimuli or required response) has similar effects 
on negative contrast. Schwartz (1975) has recently 
done such an experiment. Pigeons were exposed to a 
mult VI 3-min VI 3-min schedule with the com- 
ponents signaled by green and white key color and 
then shifted to a mult VI 3-min VI 72-sec schedule. 
Absolute rates of responding in both components are 
presented for one pigeon in Figure 17. It can be seen 
from the second panel that negative contrast occurred. 
The pigeons were then returned to the mult VI 3-min 
VI 3-min schedule. Now, however, the operant key 
was always blue. A second key, the signal key, alter- 
nated between green and white. Recall that with pro- 


logical truth is also a psychological one, we would expect both 
mult stimuli to be slightly excitatory, in which case an inhibitory 
operation could be expected to reduce responding. 

cedures designed for assessing positive contrast, no con- 
trast effect appears on the operant key (Keller, 1974). 
Rather, pecks are directed at the signal key. The 
fourth panel of Figure 17 indicates that negative con- 
trast does in fact appear on the operant key with this 
procedure. Incidentally, substantial signal-key peck- 
ing occurred during the changed component (VI 72- 
sec). This is represented at the bottom of the fourth 
panel of Figure 17. Data from all pigeons in this ex- 
periment have already been presented earlier (see 
Figure 11). 

Thus both logic and preliminary evidence suggest 
that the additivity theory of contrast may not be ap- 
plicable to negative contrast. This conclusion is simi- 
lar to the one suggested by studies of contrast in very 
different experimental contexts. Studies of rats in run- 
ways and mazes, typically involving variations in re- 
ward magnitude rather than frequency, have occasion- 
ally provided evidence for both positive and negative 
contrast (Crespi, 1944). However, the most common 
finding in such studies is clear negative contrast but 
no positive contrast (e.g., Bower, 1961; Glass & Ison, 
1966; Spear & Hill, 1965; see Dunham, 1968, for a re- 
view of this literature). Both the lack of positive con- 
trast and the presence of negative contrast in these 
studies are consistent with the additivity theory, since 
it suggests that (a) positive contrast is unlikely in 
most of the traditional situations to which rats are 
exposed and (b) positive and negative contrast may be 
fundamentally different phenomena. 

Fig. 17. Responses per minute 
for one pigeon exposed to a 
series of mult VI 3-min VI 3- 
min schedules alternating with 
mult VI 3-min VI 72-sec sched- 
ules. In panels 1 and 2 the 
discriminative stimuli were dis- 
played successively on the same 
key. In panels 3 and 4 two keys 
were used. One key was con- 
stantly illuminated blue and 
pecks at it could be reinforced 
by the prevailing VI schedule, 
the value of which was signaled 
by either green or white illumi- 
nation of a second key. Pecking 
at the second key was recorded 
but had no experimental effect. 
Responses per minute to the 
white signal key are shown at 
the bottom of the last panel. 
(From Schwartz, 1975. © 1975 
by the Society for the Experi- 
mental Analysis of Behavior, 
Inc.) 

ADDITIVITY THEORY AND LOCAL CONTRAST 


A prominent feature of demonstrations of contrast 
is its time dependence or local character. Much of this 
literature has already been reviewed above. Let us 
simply summarize by asserting that the magnitude of 
contrast decreases with time after the preceding com- 
ponent ends and increases as a function of its dura- 
tion. 

It is possible that the distinction between local and 
overall contrast effects may help reconcile those con- 
trast phenomena which are consistent with the ad- 
ditivity theory with those which are not (see de 
Villiers, Chapter 9 of this volume). The consistent 
phenomena have already been discussed in detail. 
Some of the inconsistencies are (1) occasional demon- 
strations of contrast with discriminative stimuli lo- 
cated off the key (Hemmes, 1973; Schwartz, 1974a, 
1974c; Westbrook, 1973); (2) occasional demonstra- 
tions of contrast in bar-pressing rats (Gutman & Sut- 
terer, 1974; Pear & Wilkie, 1971); (3) demonstration 
of contrast in rats with procedures employing only 
aversive stimuli (de Villiers, 1972, 1974); and (4) lack 
of contrast in some errorless discrimination procedures 
(Terrace, 1966). Rachlin has provided evidence which 
suggests that the contribution of a stimulus-reinforcer 
contingency to contrast may be largely restricted to 
periods just after a change in multiple-schedule com- 
ponents—i.e., that the stimulus-reinforcer effect may 
be local. Pigeons were exposed to mult VI 2-min VI 
2-min schedules. In one of the components additional 
reinforcements were delivered independent of re- 
sponding on a variable time (VT) 15-sec schedule. Let 
us consider the expected effect of these free rewards 
from the additivity theory of contrast. The free rein- 
forcements would create a differential stimulus-rein- 
forcer contingency, which would generate additional 
pecks in that component. Thus response rate should 
be higher when free rewards are presented than when 
they are not presented. Rachlin observed that when 
the components were 8 sec long, response rate was 
higher in the component which included free rein- 
forcements. However, when the components were 8 
min long, this was no longer true. 

Fig. 18. Rates of responding of 
two pigeons in successive inter- 
vals of 8-min components of a 
mult VI 2-min VI 2-min plus VT 
15-sec. The duration of each in- 
terval increases logarithmically 
along the abscissa. For instance, 
the first points on the abscissa 
show rates of responding during 
the first 8 seconds; the last 8 
points show rates during the 
248th to 480th seconds of the 
component. (From Rachlin, 1973.) 
Abscissa: successive intervals in 8-min 
component (seconds, log scale). 

Figure 18 presents response rates in successive sub- 
components of the 8-min multiple procedure. Early in 
each component containing free reinforcement re- 
sponding was higher than in the other component. 
Later in each component containing free reinforce- 
ment responding was lower. Rachlin’s study suggests 
that the main effect of the stimulus-reinforcer con- 
tingency may be confined to the part of a multiple- 
schedule component which borders the other compo- 
nent, i.e., the stimulus-reinforcer relation may be 
responsible primarily for local contrast. If components 
are very long, the local effect will be averaged across 
the entire component and represent only a small in- 
crease in overall response rate (no contrast). The 
shorter the component, the larger the contribution 
made by the local contrast effect to overall rate. 
Boakes, Halliday, and Poli (1975) have supported 
Rachlin’s findings. They studied a procedure very 
similar to Rachlin’s, but with 24min component dura- 
tions. They found increases in responding when free 
reinforcements were added to a component of a mul- 
tiple schedule under these conditions. 
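
The dilution argument above can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the stimulus-reinforcer contingency adds a fixed increment to response rate during a short window at the start of a component and has no effect afterwards. The baseline rate, the increment, and the window length are hypothetical values, not data from Rachlin (1973) or Boakes et al. (1975), and the sketch ignores the later decrease in responding that Rachlin observed.

```python
def overall_rate(component_sec, baseline=60.0, boost=30.0, local_window_sec=10.0):
    """Mean responses per minute over one component.

    Assumption (hypothetical): responding runs at `baseline` throughout,
    plus `boost` extra responses per minute during the first
    `local_window_sec` seconds of the component only.
    """
    boosted_sec = min(component_sec, local_window_sec)
    # The local increment is weighted by the fraction of the component it occupies.
    return baseline + boost * boosted_sec / component_sec


if __name__ == "__main__":
    for duration in (8, 120, 480):  # component durations in seconds
        print(duration, overall_rate(duration))
```

Under these assumed values the increment is fully expressed in an 8-sec component and nearly vanishes when averaged over an 8-min component, which is the pattern the local-contrast account predicts for overall rate.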

Another study which suggests that the stimulus- 
reinforcer relation contributes mainly to local con- 
trast was done by Schwartz, Hamilton, and Silberberg 
(1975). Schwartz et al. exposed pigeons to a sequence 
of multiple schedules in which the signal for the com- 
ponents was located on a second key (see Keller, 1974). 
In addition, they recorded the duration of key pecks 
on both the operant key and the signal key. The 
reader will recall that Schwartz and Williams (1972b) 
and Gamzu (1971) have obtained evidence which sug- 
gests that pecks controlled by the response-reinforcer 
relation are long-duration pecks, while pecks con- 
trolled by the stimulus-reinforcer relation are short- 
duration pecks. If the Keller procedure spatially 
separates behavior controlled by the two different con- 
tingencies, one would expect response durations on 
the signal key to be substantially shorter than response 
durations on the operant key. On the mult VI EXT 
procedure, Schwartz et al. obtained evidence that this 
was the case. Median response durations on the op- 
erant key were about 50 msec, while median dura- 
tions on the signal key were about 20 msec. Of greater 
relevance to the present discussion, however, was the 
finding that most signal-key responses occurred dur- 
ing the first 10 sec of each 120-sec VI component. The 
four pigeons emitted 14, 32, 59, and 77% of their 
signal-key responses during the first 10 sec of the VI 
component. If there were no local effect, 8.5% of 
responses would have occurred in the first 10 sec. 
These data support both the view that the stimulus- 
reinforcer contingency exerts its main effect on local 
contrast and the view that there are two distinct 
classes of pecks which may be separated on the basis 
of duration (Schwartz & Williams, 1972b). If the ad- 
ditivity theory of contrast offered here accounts 
mainly for local contrast, then one might expect to 
see contrast (though of lesser magnitude) in some 
situations to which the additivity theory does not 
apply, since not all contrast is local. Unfortunately, 
since only a small proportion of the studies which 
demonstrate behavioral contrast provide the evi- 
dence necessary for assessing local contrast, it is not 
possible to say anything very definitive on this mat- 
ter. For example, in studies which offer clear support 
for the additivity theory (e.g., Keller, 1974; Schwartz, 
1975) there are no data available on whether the 
observed contrast effects were local. As we mentioned 
above, local positive contrast is almost never reported 
with rats as subjects, although overall contrast occa- 
sionally is. This one bit of confirming evidence is a 
slender reed on which to support a theory, however, 
and until more experimental analysis has been car- 
ried out the resolution of this matter will have to wait. 
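
As a check on the chance baseline used above for the Schwartz et al. data, the share of signal-key responses expected in the first 10 sec of a 120-sec component can be computed under the assumption that responses are spread evenly through the component. The variable names are illustrative only; the observed percentages are those reported in the text. Under this uniform assumption the expected share comes out at roughly 8.3%, close to the baseline figure given above.

```python
# Assumption: with no local effect, signal-key responses are spread
# evenly across the 120-sec VI component.
window_sec = 10
component_sec = 120
expected_pct = 100 * window_sec / component_sec

# Percentages of signal-key responses in the first 10 sec, one per pigeon,
# as reported in the text.
observed_pct = [14, 32, 59, 77]

# How many times the uniform expectation each pigeon's share represents.
ratios = [obs / expected_pct for obs in observed_pct]

if __name__ == "__main__":
    print(f"expected share under uniform responding: {expected_pct:.1f}%")
    for obs, ratio in zip(observed_pct, ratios):
        print(f"observed {obs}% is {ratio:.1f} times the uniform expectation")
```

Every pigeon's share exceeds the uniform expectation, although the size of the excess varies widely across birds.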


ADDITIVITY THEORY OF 
CONTRAST: CONCLUSIONS 


We have presented a theory of contrast which re- 
lates it to autoshaping and automaintenance and 
have discussed the findings which lend support to the 
theory. Much of the evidence one would like in or- 
der to evaluate the theory is not presently available. 
However, there is enough confirming evidence to in- 
dicate that stimulus-reinforcer relations play a sig- 
nificant role in the phenomenon of behavioral con- 
trast. This being the case, it becomes clear that the 
phenomenon of behavioral contrast has been im- 
properly labeled a “schedule interaction” (Reynolds, 
1961b). Behavioral contrast reflects not so much sched- 
ule interaction as it does process interaction. The two 
processes which interact are Pavlovian and operant 
conditioning processes. Thus it is appropriate to 
examine some of the research explicitly designed to 
assess such process interactions in order to shed more 
light on what have been termed schedule interactions. 
Some of these studies are reviewed in the next section. 
In concluding our discussion of multiple-schedule in- 
teractions, it is important to emphasize that we view 
behavioral contrast merely as an example of how 
Pavlovian relations may control operant behavior. 
We have devoted a great deal of space to this example, 
but we expect that once investigators start looking for 
other examples the interactions discussed here will 
turn out to be pervasive. 


Interactions Between Pavlovian and 
Operant Conditioning and the 
Additivity Theory of Contrast 


The interaction between Pavlovian and operant 
conditioning processes has been a traditional concern 
in the study of animal learning, especially with regard 
to avoidance learning (Herrnstein, 1969; Rescorla & 
Solomon, 1967). Most of the research which has been 
done on this problem has been done in avoidance 
paradigms and is thus not the proper concern of this 
discussion. The research of interest here involves the 
superimposition of free food reinforcement (either 
signaled or unsignaled) on a base line of operant re- 
sponding which is itself maintained by food reinforce- 
ment. 


POSITIVE CONDITIONED SUPPRESSION 


What might one expect to be the effects of a signal 
for food delivery on operant responding? The litera- 
ture suggests that Pavlovian excitation will result in an increase in re- 
sponse rate (Rescorla & Solomon, 1967), a phenome- 
non we shall call conditioned enhancement. It is the 
Pavlovian alimentary analog of conditioned suppres- 
sion (Estes & Skinner, 1941). What might one expect 
on the basis of the additivity theory of contrast out- 
lined in this chapter? Its predictions are more equiv- 
ocal. A stimulus-reinforcer contingency will result in 
excitation, but whether this excitation is reflected in 
an increase in operant responding will depend upon 
(a) the spatial relation between the stimulus and the 
target for operant responses; (b) the nature of the 
operant; and (c) the relation between the Pavlovian 
US and the operant reinforcement. If, for example, 
the operant is key pecking and the CS is located on 
the key, we would expect to observe conditioned 
enhancement. If, on the other hand, the operant is 
treadle hopping and the signal is an illuminated re- 
sponse key, we would expect to observe decreases in 
treadle hopping (called hereafter positive conditioned 
suppression to distinguish it from the more familiar 
demonstration of suppression by a CS which signals 
electric shock). It is interesting that the existing litera- 
ture has precisely the equivocal character which the 
present account would predict. For example, one 
recent experiment examined the two hypothetical 
situations just described, and indeed found condi- 
tioned enhancement of key pecking and conditioned 
suppression of treadle hopping (LoLordo, McMillan, 
& Riley, 1974). This study is described in detail be- 
low. In brief, the present theory of contrast would 
predict conditioned enhancement of responding by a 
Pavlovian contingency under just those conditions 
which would also yield positive contrast with the ap- 
propriate experimental manipulations. Those situa- 
tions which yield negative induction would be ex- 
pected to result in positive conditioned suppression 
when Pavlovian contingencies are applied to operant 
base lines. This formulation is much like one ad- 
vanced by Staddon (1972) in somewhat less detail. 
The available evidence tends to support these pre- 
dictions. Azrin and Hake (1969) studied the effects of 
Pavlovian contingencies on bar pressing in rats. Press- 
ing was maintained by either food or water, and the 
Pavlovian CS (either a 6-per-sec relay click or a 10-per- 
sec blinking light) signaled either food, water, or 
intracranial stimulation (ICS). In cases where food 
and water were USs, they were five times as large as 
the food and water operant rewards. In all there were 
five groups of rats: three groups bar-pressed for food 
and received either food, water, or ICS as a Pav- 
lovian US; two groups bar-pressed for water and re- 
ceived either food or water as a Pavlovian US. All 
groups but one showed substantial suppression of bar 
pressing in the presence of the CS. The water-water 
group displayed conditioned enhancement. With the 
exception of this group, Azrin and Hake’s results con- 
firm expectations from the additivity theory of con- 
trast. However, the authors made a point of noting 
that no behaviors were observed during the CS which 
might have competed with bar pressing. This observa- 
tion presents something of a problem, since we might 
expect orientation toward and perhaps contact of the 
CS to occur and to mediate the suppression effect. Van 
Dyne (1971) observed similar suppression effects in a 
study of bar pressing in rats much like the Azrin and 
Hake study. Both Azrin and Hake and Van Dyne 
offered interpretations of their results which were 
similar to the present account. The Pavlovian con- 
tingency results in the conditioning of a respondent 
which is incompatible with the operant. Neither 
study, however, attempted to measure the putative 
respondent. A more recent study by Kelly (1973) did 
attempt to measure respondent behavior. The lever 
pressing of monkeys was maintained on a VI schedule, 
and occasional pairings of tone and food were super- 
imposed on the operant procedure. Conditioned 
suppression of bar pressing by the tone rapidly devel- 
oped and was sustained over 275 sessions. Concurrent 
measurement of heart rate and blood pressure gave no 
indication that either of these autonomic systems was 
influenced by the Pavlovian contingency. Thus just 
as Azrin and Hake failed to observe skeletal behaviors 
which might compete with lever pressing, Kelly 
failed to observe potentially competing autonomic 
behaviors. 

There are a number of demonstrations that the 
duration of the CS influences its effects. Henton and 
Brady (1970), Meltzer and Brahlek (1970), and Miczek 
and Grossman (1971) studied the effects of CS dura- 
tion on the positive conditioned suppression effect. 
Henton and Brady superimposed Pavlovian condi- 
tioning trials on responding maintained on a DRL 
30-sec schedule in squirrel monkeys. At CS durations 
of 20 and 40 sec they observed no effect. At CS dura- 
tions of 80 sec they observed conditioned enhance- 
ment, with most of the responding occurring early in 
the CS. Meltzer and Brahlek exposed rats whose bar 
pressing was maintained on a VI 2-min schedule of 
reinforcement (Noyes pellets) to pairings of a CS fol- 
lowed by access to a sucrose solution. When the CS 
was either 12 or 40 sec long they observed conditioned 
suppression. However, when the CS was 120 sec long 
they observed about a 5% enhancement of bar press- 
ing. Finally, Miczek and Grossman exposed squirrel 
monkeys to a similar procedure, in which the Pav- 
lovian and operant rewards differed. With a 30-sec CS 
they observed large suppression effects. With 1-, 2-, or 
3-min CSs, suppression still occurred, but substantially 
less. How do these effects of CS duration—i.e., the 
longer the CS, the less the suppression—relate to the 
additivity theory of contrast? According to the addi- 
tivity theory, positive conditioned suppression results 
from Pavlovian conditioning of a response which com- 
petes with the operant. Thus the less the suppression, 
the weaker the competing response. Seen in this light, 
the CS duration effects merely reflect weaker Pav- 
lovian conditioning with longer CSs. If the Pavlovian 
conditioned response were compatible with the op- 
erant (e.g., key pecking in pigeons), one would expect 
increases in CS duration to produce exactly the oppo- 
site effects. At long CS durations Pavlovian condition- 
ing would be weakened and little or no increase in 
pecking would occur. A recent study by Smith (1974) 
provides direct support for this argument. A 5-sec CS 
on the key enhanced responding, while 30- and 60- 
second CSs either had no effect or reduced responding. 
There is also indirect evidence which supports this 
view. First, it has been demonstrated that in auto- 
shaping situations, increases in trial duration result in 
decreases in responding (Ricci, 1973). Second, Rach- 
lin’s (1973) study of the effects of component duration 
on multiple-schedule performance makes a similar 
point. Recall that in Rachlin’s study free reinforce- 
ments were delivered in one component of a multiple 
schedule, both components of which were correlated 
with identical VI schedules of response-dependent 
reinforcement. Rachlin found that when the compo- 
nent duration was short (8 sec) free reinforcement in- 
creased overall response rate, while when component 
duration was long (8 min) free reinforcement de- 
creased overall response rate. If one views the stimuli 
correlated with the components of the multiple sched- 
ule as being Pavlovian CSs as well (as the additivity 
theory of contrast suggests), then Rachlin’s study can 
be interpreted as showing facilitation of pecking with 
a short CS, but not with a long CS. 


CONDITIONED ENHANCEMENT 


LoLordo (1971) exposed pigeons to a VI 2-min 
schedule with 4-sec access to grain as the reinforcer. 
Superimposed on this were presentations of 20-sec 
CSs. The CS+ was terminated in food delivery, with 
the duration of reinforcement varied from 2 to 8 sec. 
The second CS, the CS−, was correlated with no food. 
Clear enhancement of pecking developed to the CS+ 
as a direct function of reinforcement duration. No 
clear evidence of suppression of pecking developed to 
the CS−. What makes this study different from the 
ones described in the preceding section is that the 
Pavlovian contingency and the operant contingency 
both influenced the same class of behavior: key peck- 
ing. These are just the conditions under which con- 
trast is expected by the additivity theory, and Lo- 
Lordo’s data offer strong support for that theory. 

In a second, larger study, LoLordo, McMillan, and 
Riley (1974) provided more complete support for the 
additivity theory. Pigeons whose key pecking was 
maintained on a DRL schedule were exposed to pair- 
ings of CSs and food. For some pigeons the CSs were 
located on the response key; for other pigeons the CSs 
were tones. Only the first group showed conditioned 
enhancement of pecking. The second group showed 
no clear effects. This study is exactly analogous to the 
studies by Redford and Perkins (1974) and Schwartz 
(1974a, 1974c, 1975), which showed that behavioral 
contrast in multiple schedules only occurred when the 
discriminative stimuli were on the response key. 
LoLordo et al. also exposed two other groups of 
pigeons to the same procedures, except that these 
pigeons were treadle-hopping for food rather than 
key-pecking. In this case, the response-key CS sup- 
pressed the treadle hopping and autoshaped key 
pecking! This study is of course analogous to the 
demonstrations by Hemmes (1973) and Westbrook 
(1973) that induction rather than contrast occurs with 
pigeons on multiple schedules if treadle hopping is 
the required operant. Further supportive evidence 
comes from the study by Boakes, Halliday, and Poli 
(1975) described earlier. Pigeons were exposed to a 
mult VI 2-min VI 2-min schedule. When free rein- 
forcement was delivered on a VT 30-sec schedule in 
one component only, responding increased in that 
component. When it was delivered in both compo- 
nents, there was no consistent effect on responding. 
When bar-pressing rats were exposed to virtually 
identical procedures, suppression resulted from the 
differential presentation of free food. One other study 
consistent with those just described was conducted by 
Schwartz (1976). Pigeons were exposed to a VI 2-min 
schedule of reinforcement for key pecking. Period- 
ically, the response key changed color for 12 sec after 
which response-independent food was delivered. Rate 
of pecking in the presence of this stimulus was almost 
twice that in its absence. In another procedure the 
response-independent food was signaled on a second 
key while the VI key remained illuminated. Under 
these conditions, VI responding in the presence of the 
food signal was almost completely suppressed. Mean- 
while, substantial responding was maintained on the 
food signal key. The results from these two procedures 
illustrate the difference between conditioned enhance- 
ment studies with pigeons and rats. In rat studies, 
where the signal is located away from the bar, suppres- 
sion is observed. Schwartz similarly observed suppres- 
sion in the pigeon when the signal was off the key. On 
the other hand, when the signal is on the key, en- 
hancement is observed. No exactly comparable pro- 
cedures have been studied with the rat. An approx- 
imation is provided by one final recent study which 
offers support for the additivity theory of contrast. 
Christoph, Peterson, Karpicke, and Hearst (1973) 
studied rats as subjects and imposed Pavlovian condi- 
tioning trials on an operant base line and varied the 
location of the CS relative to the bar. When the CS 
was near the bar, they observed almost no suppression. 
When the CS was far from the bar, substantial sup- 
pression of bar pressing developed, accompanied by 
approach and contact of the CS. 


CONCLUSION 


The research discussed in this chapter suggests the 
following conclusions: 


1. Autoshaping of the pigeon’s key peck is the result 
of Pavlovian stimulus-reinforcer contingencies. 


2. The maintenance of responding in autoshaping 
procedures is partly determined by operant re- 
sponse-reinforcer contingencies. 


3. In standard operant procedures, stimulus-reinforcer 
contingencies, when present, will exert an influence 
on behavior which is consistent with Pavlovian 
theory. 


4. In multiple-schedule procedures, analysis of the 
interaction between Pavlovian and operant con- 
tingencies allows one to predict which species in 
which situation will show behavioral contrast. The 
substantial differences in the literature between rats 
and pigeons are consistent with this analysis. 


5. In procedures in which Pavlovian contingencies 
are explicitly superimposed on operant contingen- 
cies, the present account rationalizes a literature 
which has been marked by inconsistent experi- 
mental outcomes as a function of the species and 
the situation under investigation. 


In addition, the ideas put forth in this chapter 
speak to two broader issues. The first is the distinction 
between Pavlovian and operant conditioning. This 
issue has been discussed in various parts of the chap- 
ter. We have no solution to offer to the problem of 
classifying learning phenomena as one or the other 
type of conditioning. Rather, the analysis presented 
in this chapter represents a somewhat different ap-
proach. The approach is to choose a particular class
of behavior, and rather than classify it as Pavlovian
or operant, analyze the extent to which both types of 
contingency contribute to its occurrence. With regard 
to the class of behavior under scrutiny here—key peck- 
ing—this approach has proven fruitful. Moreover, the 
analysis can easily be extended to the behavior of 
other species in other situations. 

The second major issue addressed by this chapter 
is the biological boundaries or constraints on learn- 
ing. A number of major recent contributions to the 
field (Bolles, 1970; Hinde & Hinde, 1973; Rozin &
Kalat, 1971; Schwartz, 1974b; Seligman, 1970; Selig-
man & Hager, 1972; Shettleworth, 1972a; Staddon &
Simmelhag, 1971) have challenged the sometimes tacit 
and sometimes explicit assumption that general laws 
of learning exist and that they can be discovered by 
exploring the control of behavior in arbitrary experi- 
mental situations. Seligman (1970) has referred to this 
as the “equipotentiality premise.” A better label for it
might be the “assumption of interchangeability.” 
Whatever its label, what it conveys is the view that if 
the world of a particular species is partitioned into 
three sets—stimuli, behaviors, and rewards—then any 
member of a set can be substituted for any other 
member without materially altering the relations one 
observes among sets. This view is now demonstrably 
false. There is substantial evidence that some associ- 
ations are more rapidly acquired than others—that 
some members of the stimulus set, the behavior set, 
and the reward set “belong” together (Bolles, 1970;
Garcia & Koelling, 1966; cf. Seligman & Hager, 1972; 
Schwartz, 1974b). Indeed, a good deal of the research 
discussed in this chapter violates the assumption of 
interchangeability. It seems undeniable that there are
significant biological constraints on what organisms 
can learn, as the ethologists have always argued 
(Lorenz, 1965). The problem which now confronts the 
study of learning is the determination of what modi- 
fications in both method and theory are demanded by 
these biological constraints on learning. One possibil- 
ity is that the search for general laws must be aban- 
doned and a “botanizing” strategy adopted. It was 
Skinner’s explicit rejection of this approach (1938) 
which led to the development of current experimental 
methods. A second possibility is that one should 
search for different types of laws which take account 
of biological constraints on learning (Seligman, 1970). 
The research discussed in this chapter suggests that 
at least in some cases neither botanizing nor the de- 
velopment of new laws is necessary. The autoshaping 
literature is unequivocal evidence that the key peck 
is a special, biologically relevant behavior. In many 
areas, the study of key pecking in pigeons, bar press-
ing in pigeons, and bar pressing in rats yields different
results. Nevertheless these differences can be inter- 
preted and understood in terms of learning principles 
which are already well established. They require no 
new formulation—only new combinations of old 
formulations. Whether similar analyses will provide 
sufficient explanations of other biologically con- 
strained phenomena is an open question. For the 
present, it would be ill advised to give up the “gen-
eral principles” which guide our investigations with-
out a struggle. They may be far more general than
anyone could reasonably have suspected. 


Consider for a moment an instrumental environ-
ment which is equipped with several items of inter-
est to the small rodent known as the Mongolian gerbil
(Meriones unguiculatus). The chamber contains a
small box of sand, some sunflower seeds, a drinking
tube, an activity wheel, and some bristol board which
the animal enjoys shredding. This environment, or
variations on it, will be referred to on several occa-
sions in the discussion which follows. Given uncon-
strained access to this environment for one hour each
day, the gerbil will distribute much of the available
time among the various items of interest in the cham-
ber. Once stable behavior patterns emerge, a small
amount of engineering permits us to arrange any one
of 20 possible instrumental contingencies in the
gerbil’s world. We can require the animal to eat in
order to run, run in order to eat, drink in order to
shred paper, etc. Imposing a contingency can cause
an increase, a decrease, or no change in the probabil-
ity of the instrumental behavior. If a particular con-
tingency produces a change in the probability of
instrumental responding, most psychologists would
agree that some hypothetical process has operated
(although not all would agree that we should pursue
the matter). The term reinforcement is typically used
to refer to this process with the label reward reserved
for increases in instrumental responding and punish-
ment for decreases in instrumental responding. The
general purpose of this chapter will be to consider the
current status of our attempts to predict the outcome
of instrumental contingencies like those which we
might wish to arrange for the gerbil.
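
The figure of 20 contingencies can be reproduced under an assumption which the text does not state outright: that each of the five items in the chamber supports one behavior, and that a contingency is an ordered pair of distinct behaviors (instrumental response first, contingent response second). The following Python sketch makes that count explicit; the behavior labels are illustrative only.

```python
from itertools import permutations

# Hypothetical labels for the five activities the chamber supports
# (sand box, seeds, drinking tube, wheel, bristol board).
BEHAVIORS = ["dig", "eat", "drink", "run", "shred"]

def contingencies(behaviors):
    """Return every ordered (instrumental, contingent) pair of distinct behaviors."""
    return list(permutations(behaviors, 2))

pairs = contingencies(BEHAVIORS)
print(len(pairs))  # prints 20, i.e. 5 * 4 ordered pairs
```

Order matters in this sketch because "eat in order to run" and "run in order to eat" are different contingencies.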


A HISTORICAL PERSPECTIVE 


If we ignore some of the fine grain of the historical 
record, I would suggest that history has provided the 
contemporary experimental psychologist with one of 
three points of departure if one is interested in an 
analysis of the reinforcement process. 


Philip Dunham 


Traditional Reinforcement Theory 


In its initial form, the reinforcement process served 
one simple explanatory function. It functioned as the 
“glue” which cemented together the ubiquitous Ss 
and Rs of early associationism—or conversely, as the
“solvent” which dissolved S-R connections already
formed. Thorndike’s (1914) statement of his sym-
metrical law of effect is a classic example of this tradi-
tion:


The Law of Effect—to the situation, a modifiable 
connection being made by him between an S 
and an R and being accompanied or followed 
by a satisfying state of affairs, man responds, 
other things being equal, by an increase in the
strength of that connection. To a connection
similar, save that an annoying state of affairs
goes with or follows it, man responds, other 
things being equal, by a decrease of the con- 
nection. (p. 71) 


Starting from this assumption, the basic task which 
confronted the reinforcement theorist was to provide 
an indexing system which would tell us the category 
into which any particular contingent event might fall
—“satisfier,” “annoyer,” or “neutral.” With an ade-
quate set of rules, we should be able, for example, to
predict the outcome of any one of the 20 contingen- 
cies possible in our gerbil environment. 

As Premack (1969) has suggested, this traditional 
view of reinforcement asserts that there are three 
mutually exclusive categories of contingent event to 
be found in nature and assumes, implicitly or ex- 
plicitly, that particular contingent events are incon- 
trovertible members of a particular category. From 
this fundamental assumption, the search was initiated 
to find the property or properties which would permit 
us to know the particular category into which a par- 
ticular event might fall. It is in this context that such 
prominent notions as drive reduction, drive induction, 
arousal, optimal level, and other major concepts were 
developed. The conceptual and empirical deficiencies
of the major theoretical schemes in this tradition have 
been extensively reviewed and criticized in a number 
of articles (cf. Miller, 1963; Wilcoxon, 1969) and need 
not be repeated here. Although one might take excep- 
tion to some of the particulars, the general consensus 
of contemporary critics is that concepts such as drive 
reduction have serious shortcomings as a basis for 
predicting when a particular contingent event might 
function as a reward. 

The search for the nature of negative or aversive 
events took a curious twist when Thorndike (1932)
rejected the negative side of his classic law of effect.
All subsequent attempts to account for the observa- 
tion that some events will produce a decrease in the 
probability of instrumental responding adopted some 
variation on what has been called the alternative re- 
sponse assumption (cf. Dunham, 1971). In its weakest 
form, this assumption suggests that the decrease in 
instrumental responding produced by some contin- 
gent events is the indirect result of an increase in some 
alternative behavior. The alternative behavior is as- 
sumed to be developed and maintained by the posi- 
tive side of the law of effect (i.e., an escape from aver- 
sive stimulation mediated by either a two-process or 
single-process conditioning model). 

Parallel with the decline in popularity of concepts 
such as drive reduction, there is also a contemporary 
disenchantment with the various forms of the alterna- 
tive response assumption as explanations of the effects 
which aversive events are observed to have upon be- 
havior. Rachlin and Herrnstein (1969) and Dunham 
(1971) have discussed both data and conceptual argu- 
ments against the alternative response assumption as 
an explanation of the effects which negative contin- 
gent events have upon instrumental behavior. 

To summarize, the fundamental task posed by 
traditional reinforcement theorists was to predict the 
effect a particular contingent event might have— 
positive, negative, or neutral. Although the notions 
generated by this task appear to be in some disfavor 
at present, it remains as one possible point of de- 
parture for the contemporary student of reinforce- 
ment. 


The “Incentive” Function 
of Reinforcement 


As Walker (1969) has suggested, the general trend 
in the development of theoretical views of reinforce- 
ment has been to expand the explanatory powers of 
the concept. This expansion provides us with a sec- 
ond point of departure for studying reinforcement. 

As stated earlier, the process was initially a “glue” 
in the chemistry of associationism. Primarily under 
the direction of Hull, the concept started to mutate 
into a much more powerful explanatory mechanism. 
Concerned with such problems as the rate at which 
performance changed with changes in magnitude of 
reward, Hull decided that the process not only ce- 
mented the habit structure together, but pulled the 
organism down the runway via K (incentive) and its 
machinery, the r_g mechanism. Hence by the 1950s the
reinforcement process was busy with such additional 
tasks as moving the animal out of the start box and
mediating changes in performances which accom-
panied changes in the magnitude of reward. The ad-
dition of incentive motivation to the function served
by the reinforcement process added a large and per- 
sisting area of research and theorizing to the rein- 
forcement literature. It permitted us to deal with the 
tangled problems raised by the incentive concept with- 
out being distracted by the questions raised by Thorn- 
dike and the early associationists. For example, given 
the observation that a hungry rat will run down a 
runway for food pellets, it is possible to proceed with 
an extensive empirical analysis of the effects which a 
shift in the number of food pellets might have upon 
the rat’s performance without ever worrying about 
the more fundamental question of why the rats ran 
for the food in the first place. Any thirst for theoret- 
ical analysis can be quenched quite readily by con- 
sidering the rather complicated manner in which the 
incentive mechanism has been suggested to mediate 
changes in performance produced by a shift in the 
number of food pellets (cf. Dunham, 1968). 


The “Weak Law of Effect” 


Yet a third point of departure in the analysis of 
reinforcement starts from the assumption which has 
been called the weak law of effect. Although the weak 
law of effect has been given a number of different 
labels and definitions, all versions imply that we can 
conduct an empirical analysis of various contingency 
operations until, for example, we find that food is an 
effective contingent event for training a hungry rat to 
press a lever. We can then proceed with an analysis 
of variations on this contingency (e.g., schedules of 
reward), assuming that we have found at least one of 
several possible cases in which the reinforcement proc- 
ess is operating. As discussed by Meehl (1950), one 
way in which the weak law of effect can escape circu-
larity is by assuming that food is also an effective con-
tingent event for a variety of instrumental responses 
(i.e., the reward event is transituational). The same 
arguments can be used, for example, if electric shock 
is used as a contingent event which will effectively
suppress a number of different behaviors. 

The contents of this volume provide ample evi-
dence for the popularity of the weak law of effect as a
point of departure in the analysis of reinforcement. 
The adoption of the empirical approach to rein- 
forcement represented by the weak law of effect and 
the transituational assumption also permits one to 
ignore the traditional questions about reinforcement 
posed by Thorndike and his successors. 


THE NATURE OF REINFORCING STIMULI 


Current Status of the Reinforcement Concept 


In my opinion, contemporary psychologists work- 
ing in the context of reinforcement phenomena have 
opted almost exclusively to approach the problem 
using the weak law of effect as a justification for an 
extensive analysis of scheduling effects or to investi- 
gate the incentive function of the contingent event 
(be it fear or the r_g mechanism). Attempts to develop
a theory of reinforcement which is capable of pre- 
dicting the effects of a particular contingent event 
prior to arranging the contingency are sparse. Per- 
haps most of us have accepted the statement made by 
Meehl (1950) concerning the problems which would 
confront such an effort: 


Finally, it would be very nice if in some magical 
way we could know before studying a given 
species exactly what stimulus changes would 
have the reinforcing property; but I have tried 
to indicate that this is essentially an irrational 
demand. (p. 74) 


In support of using the weak law of effect as an 
approach to reinforcement, Meehl promised us some 
20 years ago that such an inductive empirical exami- 
nation would lead to some potent predictive rules 
(even for visiting Martians). I would suggest that the 
track record for the past few decades is not as im-
pressive as Meehl hoped, and it is perhaps time for a
more concerted emphasis upon some of the fundamen- 
tal questions posed by traditional reinforcement the- 
ory. First, the data base generated by the inductive 
analysis is dangerously circumscribed. It is difficult to 
find much more than the various combinations of 
lever press, key peck, food, water, and electric shock 
upon which to base generalizations. Unlike Meehl’s 
Martian, I find little information upon which to base 
an inductive leap into the multiple-response world of 
the gerbil described earlier. Perhaps the problem has 
been less with the inductive process than with the 
convenience of the manufactured operant chamber. 
Second, and related, is that the critical assumption 
that rewards (or punishments) are transituational has 
not been subjected to extensive, rigorous testing, 
again perhaps because of the compatibility of key 
pecks and impulse counters. 

In view of these problems, it would appear that 
there is some justification for returning to the ques- 
tion of how we might predict the outcome of an 
instrumental contingency before it has been arranged. 
Hence, the remainder of the chapter will be devoted 
to a discussion of two contemporary trends in the 
literature which run counter to the weak law of effect
as an approach to reinforcement and which have gen-
erated data which question the validity of the assump-
tion that rewards are transituational. The first trend,
an attempt to return to a symmetrical law of effect, is
represented by the work of Premack (1959, 1965, 1971)
and his students. The second is recent research con- 
cerned with the biological constraints which operate 
when an organism is exposed to particular instru- 
mental contingencies. The result of this exercise will 
be to demonstrate that the transituational assumption 
is not valid and that a symmetrical law of effect based 
upon preference notions represents a more fruitful 
approach to reinforcement theory. 


PREMACK’S REINFORCEMENT THEORY 


The essential features of Premack’s reinforcement 
theory are contained in three major papers (Premack, 
1959, 1965, 1971). He suggests that the organism 
places the events in its world on a unitary dimension 
of value or preference. The relative value of a par- 
ticular event can be measured in terms of the amount 
of time the organism spends engaging in that event. 
If we use time as an indicator of value, it is possible, 
according to Premack, to predict the outcome of any 
particular instrumental contingency. If we arrange a 
contingency in which the value of the contingent event is
higher than the value of the instrumental event, 
Premack predicts that we will observe an increase in 
the probability of the instrumental response. If we 
arrange a contingency in which the value of the con- 
tingent event is lower than the value of the instru- 
mental event, Premack predicts that we will observe a 
decrease in the probability of the instrumental re-
sponse. It should be noted that the value of a particu-
lar event is measured in terms of the amount of time
the organism spends indulging in that event when
permitted unconstrained access to it. These rules are
simple and testable. They suggest that the organism
will increase the probability of a behavior which
moves it from a less to a more preferred state, but
that it will not engage in a behavior which moves it
from a more to a less preferred state.
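
The two rules can be written as a single comparison of base line durations. The Python sketch below is only an illustration of the rules as stated here: the function name and the durations are hypothetical, and the "no change" outcome for equal values is an assumption, since the text does not address that case.

```python
def premack_prediction(instrumental_time, contingent_time):
    """Predict the change in instrumental responding from base line durations.

    Both arguments are the time (e.g., in seconds) spent in each activity
    during unconstrained access, which the text uses as the index of value.
    """
    if contingent_time > instrumental_time:
        return "increase"   # contingent event is more valued: reward
    if contingent_time < instrumental_time:
        return "decrease"   # contingent event is less valued: punishment
    return "no change"      # equal values: an assumption, not from the text

# Hypothetical base line: a thirsty rat drinks for 600 s and runs for 120 s.
run_to_drink = premack_prediction(instrumental_time=120, contingent_time=600)
drink_to_run = premack_prediction(instrumental_time=600, contingent_time=120)
print(run_to_drink, drink_to_run)
```

The same contingent event therefore receives opposite predictions depending on which response it is paired with, which is the sense in which the rules are relative.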


Historical Context of Premack’s Position 


Before looking at the evidence for these assump-
tions it may be helpful to see where Premack fits into
the three approaches to reinforcement discussed ear-
lier. Essentially, Premack is concerned with the prob-
lem that Meehl finds “irrational.” It is Thorndike’s
problem and the question central to traditional rein-
forcement theory: How can we predict the outcome
of an instrumental contingency? There is a funda- 
mental difference in approach, however. Thorndike’s 
analysis implies that there are three different types 
of event in nature which have absolute properties 
termed satisfying, annoying, and neutral. Hence 
‘Thorndike and those who followed in his tradition 
attempted to find the absolute properties which per- 
mitted certain events to be assigned to fixed cate- 
gories (i.e., satisfiers and annoyers). Premack makes 
the fundamentally different presupposition that the 
property of being rewarding or punishing is relative. 
Food, for example, can be either a reward or a punish- 
ment depending upon the relative value of the in- 
strumental response. If an instrumental running 
response is less probable than eating, the food will func-
tion as a rewarding event. If the running response is
more probable than eating, the food will function as 
a punishing event. The recognition of the relativistic 
nature of reward and punishment represents a sig- 
nificant departure from traditional thought about the 
reinforcement process. 

The relativistic position necessarily contradicts the 
assumption that rewards are transituational. The di- 
rect and testable implication of Premack’s rules is that 
a given contingent response like eating can function 
as a reward in one context and as a punishment in 
another context. If food is demonstrated to reward 
running in one instrumental contingency and punish
running in another instrumental contingency, food, 
by definition, is not a transituational reward. 

Finally, Premack’s position can be viewed, in a 
curious sense, as having continued the trend described 
by Walker (1969) of increasing the explanatory pow-
ers of the reinforcement concept. As a miniature
model, typical of contemporary psychological theory,
it omits reference to such concepts as habit, incentive,
the r_g mechanism, or the “glue” which were so promi-
nent in the heyday of multiconcept theories such as
Hull’s (1943, 1952). A consequence of these omissions
is that Premack’s reinforcement concept is a single
motivational construct which is assumed to account
for all changes in performance observed when an in-
strumental contingency is instituted. Although I have
yet to learn the Angerthas, I suspect that Walker has
accused Professor Premack of having found the “one 
ring.” 


Evidence for Premack’s Assumptions 


Prior to 1971 most of the evidence in support of 
Premack’s position was restricted to the positive side 
of the symmetrical rules which he proposes. Most ex-
periments were designed to test the assumption that
the contingent response must be more probable than
the instrumental response for there to be an increase
in the probability of instrumental behavior. The basic
design used in all experiments, with minor variations, 
is a three-stage (ABA) procedure. During the first 
stage the independent probabilities of the responses 
which are to be used in the instrumental contingency 
are measured. The organism is permitted uncon- 
strained access to the appropriate manipulanda (e.g., 
a drinking tube and a running wheel), and the 
amount of time spent engaging in each of these be-
haviors is measured. During the second stage a con-
tingency is arranged in which the organism is re-
quired to engage in a specified amount of instrumental
behavior in order to gain access to a specified amount 
of contingent responding. Any increase (or decrease) 
in instrumental responding above (or below) the 
previously measured base line probability indicates 
that the reinforcement process has operated. A third 
stage is often included in which the animal is per- 
mitted to return to the unconstrained base line con- 
dition. 

Using this basic procedure, the evidence which 
supports the positive side of Premack’s differential 
probability rules extends over a wide range of species 
and responses. Using Cebus monkeys which were 
given unconstrained access to one of four manipu- 
landa during base line sessions (a door, a plunger, a 
vertical lever, and a horizontal lever), Premack (1963) 
arranged instrumental contingencies in a manner 
which required the monkeys to respond on a less 
probable manipulandum in order to gain access to a 
more probable manipulandum (low to high), or in 
which the animals were required to respond to a more 
probable manipulandum in order to gain access to a 
less probable manipulandum (high to low). In all 
cases, the low-to-high contingency increased the prob- 
ability of the instrumental response and no such 
increase was observed when the contingent response 
was less probable. 

In a similar series of experiments in which rats 
were used as subjects and running and drinking were 
the members of the instrumental contingency, the 
reinforcement relation was demonstrated to be re- 
versible, as the differential probability rules predict. 
Using a low-to-high contingency in which drinking 
was more probable than running, drinking was ob- 
served to increase the probability of running—a rather 
typical observation. Less often observed, however, was 
the reverse case in which the running response was 
made contingent upon drinking. With appropriate 
manipulation of the deprivation parameters, the
relative probability of the two responses was reversed,
and running was demonstrated to be an effective 
contingent event for increasing the probability of in- 
strumental drinking (cf. Premack, 1962; Schaeffer, 
1965). 

The generality of these findings has been extended 
to include contingencies between intracranial stimula- 
tion (ICS) and drinking (Holstein & Hundt, 1965) 
and between running and lever pressing (Hundt & 
Premack, 1963). The experiments have also included
both children (Premack, 1971) and college students 
(Schaeffer, Hanna, & Russo, 1966) as subjects. 

It was not until 1971 that the punishment side of 
Premack’s formulation was presented in any detail 
(Premack, 1971). Although the punishment rule is the 
simple converse of the reward rule, testing it poses 
some problems. Consider, for example, a thirsty rat
which is given free access to a running wheel and a 
drinking tube. With appropriate deprivation param- 
eters, drinking can be made more probable than run- 
ning. If a contingency is now arranged in which the 
animal is required to drink in order to run (high to 
low), the animal will very likely continue to drink 
until satiated because drinking is the preferred state. 
To demonstrate that the less probable response is an 
aversive event, the animal must be forced to run; and 
it must be demonstrated that the forcing operation 
per se is not the effective punishment event. 

The initial work on this problem was presented in 
a paper by Weisman and Premack at the meetings of 
the Psychonomic Society in 1966 and discussed sub- 
sequently by Premack (1971). Weisman and Premack 
permitted rats free access to a drinking tube and a 
motorized running wheel for daily 15-min sessions.
Two rats were maintained on a 23-hr water depriva-
tion schedule, which made drinking more probable
than running, and two rats were maintained ad lib,
which made running more probable than drinking. 
After base line probabilities of responding were es- 
tablished for all four animals, contingency sessions 
were initiated in which 15 laps on the drinking tube 
produced 5 sec of motorized running in the wheel. For 
the two deprived animals the lick-to-run contingency 
(high to low) suppressed the amount of drinking, as 
predicted. For the ad lib animals the lick-to-run con- 
tingency (low to high) increased the amount of drink- 
ing, as predicted. The results in the latter condition 
also indicate that the forcing operation per se was not 
aversive. Subsequent recovery of the base line prob- 
abilities was followed by a reversal of deprivation 
conditions for each subject. A crossover design re- 
vealed results identical to those obtained in the first 
phase of the study.

The results of a subsequent, similar study by Ter-
hune and Premack (1970) also indicate that forced
running is an aversive event when it is less probable 
than instrumental drinking. In addition, Terhune 
and Premack reported that the amount of suppression 
produced by forced running was a linear function of 
the probability that the animal would not be in the 
state of running at any fixed time after running had 
been initiated. 


Problems with Premack’s Formulation 


As is obvious from the preceding discussion, a sub- 
stantial amount of data has accumulated to support 
Premack’s differential probability rules (particularly 
the reward side). There are, however, several basic 
problems with the supporting data which suggest that 
some additional experimentation and a reformulation 
of the position might be in order. 

An examination of the data in support of both the
reward and the punishment assumption reveals two
basic confoundings:


1. With respect to the reward assumption, in every 
procedure in which it has been demonstrated that 
a more probable response will reinforce a less prob- 
able response, the subject was required to increase 
the probability of the less probable instrumental 
response in order to maintain the contingent re- 
sponse at the independently measured free perfor- 
mance level. 


2. With respect to the punishment assumption, in every
procedure in which it was demonstrated that a less 
probable contingent response would punish a more 
probable instrumental response, the subject was 
required to increase the probability of the contin- 
gent response above the independently measured 
free performance level in order to maintain the 
instrumental response at its free performance level. 


In view of these two basic confoundings, the pos- 
sibility remains that: 


1. A less probable contingent response can be demon-
strated to reinforce a more probable instrumental
response if the contingency requirements are such 
that the subject must increase the probability of 
the instrumental response above its free perfor- 
mance level in order to maintain the free perfor- 
mance level of the contingent response. 


2. A more probable response can be demonstrated to 
punish a less probable instrumental response if the 
contingency arrangements are such that the subject 
must increase the probability of the contingent 
response above its free performance level in order
to maintain the instrumental response at its free
performance level. 


Evidence for a Response Deprivation Hypothesis 


There are three independent lines of experimenta- 
tion available at the present time which have been 
generated by the preceding rationale. The results of 
all three efforts indicate that, contrary to Premack’s 
assumptions, under certain conditions a less probable 
response can serve as a reward for a more probable 
instrumental response. 

Eisenberger, Karpman, and Trattner (1967) were
the first to examine the problem. They reported a 
series of experiments in which college students were 
given access to two different manipulanda: a wheel 
which could be hand-cranked and a lever which could 
be pressed. In the critical experiment in the series, 
each subject was run in a 5-min base line session in 
which both wheel and lever were freely available. 
This was followed by a 5-min contingency session in 
which the subject was required to crank the wheel 10 
revolutions in order to press the lever once. The re- 
sults of the base line session revealed that most sub- 
jects preferred cranking the wheel to lever pressing. 
During the subsequent contingency session, if the 
lever-press response was suppressed by the contingency 
requirement, an increase in instrumental wheel crank-
ing was observed—even if the contingent lever-press
response was less probable than the instrumental
response. 

Eisenberger et al. interpreted their results in terms
of a response suppression hypothesis which stated that 
the necessary condition for an increase in instrumen- 
tal responding is the suppression of the contingent 
response, independent of the relative probability of 
instrumental and contingent responses. As they stated: 


The present set of experiments suggests the nec- 
essary and sufficient condition for reinforcement 
in the contingency situation is the animal’s ne- 
cessity to increase instrumental responding if it 
is to maintain contingent responding at the free 
performance level. (p. 350) 


The second line of experimentation designed to 
examine this problem consists of two experiments 
performed in our laboratory using subjects, procedure, 
and apparatus which more closely approximate Pre- 
mack’s early experimentation. 

The first experiment was a master’s thesis con- 
ducted by Susan Marmaroff (1971) which employed 
albino rats as subjects and running and drinking as 
the members of the instrumental contingency. The 
apparatus, identical to that described in a paper by 
Dunham (1972), consisted of a single modified activity 
wheel housed in a dark, ventilated, sound-attenuating 
chamber located in a room adjacent to the control 
apparatus. A motor-cam mechanism permitted us to 
insert or retract a drinking tube to which the animal 
had access through a small hole in the stationary wall 
of the wheel. 

The rats were on 23-hr water deprivation, and daily 
sessions 60 min in length were conducted 7 days per 
week. The experimental procedure was divided into 
four phases. During Phase 1 the animals were per- 
mitted free access to the running wheel with the 
drinking tube retracted for 15 consecutive sessions. 
Animals were permitted access to water for 1 hr in 
the home cage immediately after the session. 

During Phase 2 the running wheel was mechan- 
ically locked so that no running could occur. The 
drinking tube was inserted, and a base line level of 
free operant drinking was observed. Phase 2 continued 
for 15 sessions. 

During Phase 3 the subjects were permitted free 
access to both the drinking tube and running wheel. 
In all phases of the experiment, drinking was defined 
as a single lap on the tube and running as a 90-deg 
revolution of the wheel. Electronic circuitry divided 
the entire session into 2-sec intervals, and each inter- 
val was scanned for an instance of either drink or run. 
The data were converted to a probability measure by 
dividing the number of intervals in which a response 
occurred by the total number possible in the session 
(cf. Premack, 1965). 
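
The interval-scan measure just described can be sketched in a few lines. This is only an illustration of the arithmetic, not the electronic circuitry used in the experiment; the function name, the argument names, and the lick times are hypothetical.

```python
def response_probability(event_times_sec, session_sec=3600.0, interval_sec=2.0):
    """Fraction of scan intervals that contain at least one response.

    event_times_sec: times (sec from session start) at which a response
    (a lap on the tube or a 90-deg turn of the wheel) was recorded.
    """
    n_intervals = int(session_sec // interval_sec)
    occupied = set()
    for t in event_times_sec:
        # Ignore anything recorded outside the scanned part of the session.
        if 0 <= t < n_intervals * interval_sec:
            occupied.add(int(t // interval_sec))
    return len(occupied) / n_intervals


# Hypothetical lick times in a 10-sec stretch, for illustration only.
licks = [0.4, 0.9, 1.5, 2.2, 7.0, 7.1, 7.3]
print(response_probability(licks, session_sec=10.0))  # 3 of 5 intervals -> 0.6
```

Note that several responses falling in the same 2-sec interval count once, so the measure is closer to time allocation than to response rate.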

During Phase 4, a run-to-drink contingency was 
arranged for all animals. To commence each session, 
the wheel was available for free-access running (brake 
released) and the drinking tube was retracted. A fixed- 
ratio schedule was selected for each animal in which 
a fixed number of 90-deg revolutions in the wheel 
(instrumental response) produced the drinking tube 
for a fixed number of licks (contingent response), 
after which the tube was retracted. The instrumental 
and contingent requirements were arranged such that 
each animal had to increase the base line amount of 
running (as measured in Phase 3) by approximately 
50% in order to maintain the contingent response at 
its base line level. As indicated in Table 1, the base 
line level of responding in Phase 3 shows that run 
was consistently more probable than drink for sub- 
jects 3, 5, and 6; that drink was more probable than 
run for subject 1; and that no consistent preference 
was observed for subjects 2 and 4. A subject was 
judged inconsistent if it reversed its preference more 
than once during the six sessions immediately prior 
to the contingency training. The running wheel re- 
mained in the free-access state throughout the ses- 
sion; thus running was possible during both states of 
the drinking tube, inserted or retracted. Phase 4 lasted 
for 15 sessions and was followed by a final phase in 
which the contingency was eliminated and free access 
to the wheel and tube was reinstated. 
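
The way a fixed-ratio requirement translates into a forced increase in running can be worked through as follows. This is a rough sketch under stated assumptions: the base line counts are hypothetical (the chapter does not report raw counts), the schedule values are those listed for subject 3 in Table 1, and partial cycles of the schedule are ignored.

```python
def required_instrumental(baseline_contingent, instrumental_req, contingent_req):
    """Instrumental responses needed to earn the base line amount of the
    contingent response, given `instrumental_req` instrumental responses
    per `contingent_req` contingent responses."""
    return baseline_contingent * instrumental_req / contingent_req


def required_increase(baseline_instrumental, baseline_contingent,
                      instrumental_req, contingent_req):
    """Proportional increase over base line instrumental responding that
    is needed to hold the contingent response at its base line level."""
    needed = required_instrumental(baseline_contingent,
                                   instrumental_req, contingent_req)
    return needed / baseline_instrumental - 1.0


# Hypothetical base line: 400 quarter-revolutions and 1200 licks per session.
# Schedule: 10 quarter-revolutions produce the tube for 20 licks.
print(required_increase(400, 1200, 10, 20))  # 600 needed vs. 400 -> 0.5
```

With these assumed counts the animal would have to run 50% more than its base line amount to keep drinking at its base line level, which is the kind of requirement described above.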


Table 1 Instrumental and contingent response requirements 
for each subject during contingency training. 

                                                   PHASE 3 
SUBJECT   INSTRUMENTAL       CONTINGENT        CONTINGENCY 
          (No. of 90-deg     (No. of licks) 
          revolutions) 

1         5                  30                low to high 
2         8                  35                no preference 
3         10                 20                high to low 
4         8                  35                no preference 
5         10                 30                high to low 
6         9                  30                high to low 


The results of the five phases of the experiment for 
each of the subjects are presented in Figure 1. As is 
evident from the base line data, the deprivation 
parameters produced low-to-high, high-to-low, and 
nondifferential probability cases. The major result 
was that the contingency produced an increase in the 
instrumental performance of all subjects. Reinforce- 
ment was thus observed in low-to-high, high-to-low, 
and nondifferential probability cases. A t-test for cor- 
related observations indicated that the increase in 
the probability of instrumental running reliably ex- 
ceeded both the single-response base line (Phase 1) 
and the two-response base line (Phase 3) (single- 
response base line comparison: t = 6.662, p < .005; 
two-response base line comparison: t = 8.50, p < .005). 

Fig. 1. Probability of running and drinking for each subject 
during each of the five phases of the first experiment. 

The contingency also suppressed the contingent 
drinking response. The amount of suppression was 
not great, and drinking tended to recover over the 
course of training. A correlation between the amount 
of suppression of contingent drinking and the amount 
of increase in instrumental responding was reasonably 
high, but not reliable (r = .48; df = 5; p > .05). The 
presence of some degree of positive correlation sup- 
ports the suggestion of Eisenberger et al. that sup- 
pression of contingent responding is an important 
factor in obtaining the reinforcement effect (see also 
Premack, 1965, p. 172), although the sample size is 
not large enough to argue persuasively either way. 
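
For readers who wish to see how the two statistics used here are computed, a minimal sketch follows. The data are hypothetical and are not the values obtained in the experiment, so the printed results will not reproduce the figures reported above; the function names are likewise illustrative.

```python
import math


def paired_t(before, after):
    """t statistic for correlated (paired) observations, with its df."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n), n - 1


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical probabilities for six subjects (illustration only).
run_baseline = [0.20, 0.25, 0.40, 0.22, 0.35, 0.38]
run_contingency = [0.35, 0.36, 0.55, 0.30, 0.52, 0.50]
drink_suppression = [0.05, 0.02, 0.08, 0.01, 0.06, 0.03]
run_increase = [c - b for b, c in zip(run_baseline, run_contingency)]

t, df = paired_t(run_baseline, run_contingency)
r = pearson_r(drink_suppression, run_increase)
print(f"t = {t:.2f} (df = {df}), r = {r:.2f}")
```

The t statistic asks whether the mean within-subject change differs from zero; the correlation asks whether the subjects whose drinking was most suppressed are also those whose running increased the most.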

During the nine sessions of the final phase (Phase 5), the subjects 
returned, in general, to the base line probability of 
responding. 

In summary, the data from Marmaroff’s thesis also 
question the validity of Premack’s reward rule. A 
knowledge of the independent probability of instru- 
mental and contingent responses is apparently not an 
adequate basis for predicting reward (and by impli- 
cation punishment) effects in a run-to-drink contin- 
gency. 

A second experiment has since been completed, us- 
ing a drink-to-run contingency instead of the previ- 
ously employed run-to-drink contingency. In the first 
experiment, the run-to-drink contingency reduced the 
total amount of water intake during the early sessions 
of training. It is possible that such a change in water 
intake was responsible for the increase in instrumental 
running as a general activity phenomenon (cf. Camp- 
bell & Lynch, 1968). By reversing the contingency and 
establishing drinking as the instrumental event, the 
confounded change in water deprivation is elimi- 
nated. In addition, the generality of the argument 
against Premack’s assumptions is increased to include 
the drink-to-run situation. 

The second experiment followed the same pro- 
cedure used in the first. It was divided into three 
phases: a two-response base line phase; a contingency 
training phase; and a two-response base line recovery 
phase. The single-response base line phases used in 
the first experiment were not included as control con- 
ditions, largely because the reinforcement effects ob- 
tained in the first experiment were observed to exceed 
both the single- and the two-response base line. 

During Phase 1 the animals were permitted free 
access to the running wheel and drinking tube each 
daily session. This phase continued for 15 days, and 
the base line probability of each response was mea- 
sured with the same method as employed in the first 
experiment. 

During Phase 2 a drink-to-run contingency was 
established for all subjects. The contingency required 
the subjects to complete 50 laps on the drinking tube 
in order to obtain five 90-deg revolutions of the wheel. 
This contingency, as in the first experiment, required 
each subject to increase the amount of instrumental 
responding above its base line level if contingent 
responding was to be maintained at the base line 
level. Unlike the first experiment, however, the use of 
the same contingency requirement for each subject 
meant that each subject had to increase instrumental 
responding by different amounts in order to maintain 
contingent responding at base line level. Phase 2 
continued for 10 sessions and was followed by 2 ses- 
sions of recovery with free access to both wheel and 
tube. 

The results of this experiment are presented in Fig- 
ure 2. In Phase 1 of the experiment subjects 1 and 5 
showed consistent preferences for running over the 
last six sessions of the two-response base line. Subject 
3 revealed a consistent preference for drinking, and 
subjects 2, 4, and 6 failed to reveal any consistent 
preference. 

The introduction of the drink-to-run contingency 
in Phase 2 provided three types of probability rela- 
tion. Subjects 1 and 5 participated in a low-to-high 
contingency; subject 3 participated in a high-to-low 
contingency; and subjects 2, 4, and 6 represented the 
indifferent case. As seen in Figure 2, all subjects in- 
creased the probability of drinking to levels which 
exceeded the base line probability. A t-test for cor- 
related observations revealed this increase to be reli- 
able (t = 6.50, p < .05). The contingency also sup- 
pressed contingent running, and again the correlation 
between the amount of suppression and the amount 
of increase in instrumental responding was substan- 
tial but not reliable with the same sample (r = .74, df 
= 9; p > .05). During the two sessions of base line recov- 
ery in Phase 3, the responses tended to return to their 
original base line levels. 

Fig. 2. Probability of running and drinking during the last 
six sessions of base line (B), contingency (C), and recovery 
(R) phases of the second experiment. 

Along with the results reported by Eisenberger et 
al. (1967), the data from these two experiments di- 
rectly question the validity of Premack’s reward and 
punishment assumptions. A contingent response was 
demonstrated to reinforce an instrumental response 
whether the contingent event was higher or lower in 
probability than the instrumental response. 

More recently a third series of experiments based 
upon the same rationale has been reported by Allison 
and Timberlake (1974). Their first experiment was 
designed to demonstrate that a less probable response 
would reward a more probable instrumental response 
if the contingency was arranged so that the animal 
had to increase the probability of the instrumental 
response in order to obtain the base line (free-access) 
amount of contingent responding. In the first phase 
of the experiment, albino rats were given daily 10-min 
base line sessions in which they had simultaneous ac- 
cess to two drinking tubes. One tube contained a .4% 
saccharin solution, the other a .3% solution. The rats 
consistently preferred the .4% solution during this 
base line exposure. Following base line sessions, con- 
tingency sessions were conducted in which the rats 
were required to lick the .4% solution for 80 sec in 
order to gain access to the .3% solution for 10 sec. 
This contingency arrangement required that the rats 
increase the amount of .4% instrumental licking if 
they were to obtain their usual base line amount of 
.3% contingent licking. 

Contrary to Premack’s reward rule, this high-to- 
low contingency increased the amount of instrumen- 
tal .4% licking to levels which exceeded the base line. 
The amount of contingent licking was suppressed by 
the contingency, and, curiously, the rats licked the 
contingent .3% solution for only a small percentage 
of the time that it was made available during the 
contingency sessions. A second experiment using .4% 
and .1% saccharin solutions, as instrumental and 
contingent events respectively, corroborated the re- 
sults obtained in the first study. 


The final experiment in the Allison and Timber- 
lake series followed the same procedure as the first 
two, except that the contingency arranged did not 
require an increase in the amount of .4% licking in 
order to receive the base line amount of .1% contin- 
gent licking. This procedure failed to increase the 
.4% instrumental licking. 

The results reported by Allison and Timberlake 
add further support to the data which question Pre- 
mack’s differential probability rules. To account for 
their findings, Allison and Timberlake (1974) and 
Timberlake and Allison (1974) have outlined a posi- 
tion called the response deprivation hypothesis. For a 
particular contingency arrangement, response depriva- 
tion is identified “if the animal, by performing its 
baseline amount of instrumental response, is unable 
to obtain access to its baseline amount of the con- 
tingent response” (Timberlake & Allison, 1974, p. 
152). They suggest that the necessary and sufficient 
condition for an increase in instrumental responding 
is that the instrumental contingency employed pro- 
duce the response deprivation state. The response 
deprivation hypothesis, as outlined by Timberlake 
and Allison, does not differ in any essential manner 
from the response suppression hypothesis offered 
earlier by Eisenberger et al. (1967). In future discus- 
sion these two positions will be treated as identical 
and referred to as the response deprivation hypothesis. 
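
The condition quoted above can be stated as a simple inequality. If a schedule requires I units of the instrumental response for C units of the contingent response, and the base line amounts are O_i and O_c, then performing the base line amount of the instrumental response earns O_i * C / I of the contingent response, and deprivation exists when that amount is less than O_c. The symbols, the function name, and the base line durations below are my own illustrative assumptions; only the 80-sec and 10-sec schedule values come from the first Allison and Timberlake experiment.

```python
def is_response_deprived(instrumental_req, contingent_req,
                         baseline_instrumental, baseline_contingent):
    """True if performing the base line amount of the instrumental response
    earns less than the base line amount of the contingent response."""
    earned = baseline_instrumental * contingent_req / instrumental_req
    return earned < baseline_contingent


# Hypothetical base line: 300 sec of .4% licking, 100 sec of .3% licking.
# Schedule from the first experiment: 80 sec of .4% for 10 sec of .3%.
print(is_response_deprived(80, 10, 300, 100))  # earns 37.5 sec < 100 -> True

# A hypothetical lenient schedule (10 sec for 10 sec) imposes no deprivation.
print(is_response_deprived(10, 10, 300, 100))  # earns 300 sec >= 100 -> False
```

Note that the relative size of the two base line amounts does not enter the test on its own; only the relation between the schedule ratio and the base line ratio does.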

Taken together, the evidence obtained in the three 
independent lines of experimentation which have 
been discussed in this section permits two conclusions: 
(a) all of the experiments described directly question 
the validity of Premack’s differential probability rules; 
and (b) the data consistently suggest that the instru- 
mental contingency must force the amount of con- 
tingent responding below its base line operant level 
in order to observe an increase in instrumental re- 
sponding, i.e., the response deprivation hypothesis. 


Some Control Considerations 


THE YOKED CONTROL PROCEDURE 


Before the evidence leading to the two above- 
mentioned conclusions can be completely accepted, a 
number of control questions need to be considered. 
First and perhaps most important, it is obvious that 
the introduction of most instrumental contingencies 
will drastically change the manner in which the or- 
ganism normally distributes the contingent response 
in time when compared to the base line performance. 
It is possible that many of the changes observed with 
the use of Premack’s typical experimental procedure 
can be produced by simply changing the manner in 
which the organism is permitted access to the contin- 
gent event in time without arranging an instrumental 
contingency. In order to examine this possibility, the 
most appropriate control procedure would be to in- 
clude a group of animals in each experiment which 
were yoked to the instrumental contingency group 
and received the reward in the same temporal pattern 
without any contingency in effect. With the yoked 
control group, one should be able to judge the effects 
which the change in the temporal distribution, as 
well as the amount of contingent responding per se, 
had upon the designated instrumental behavior. Pre- 
mack (1965, pp. 166-172) recognized this problem 
and suggested that the reduction in the amount of 
contingent responding usually produced by an in- 
strumental contingency may, along with the low-to- 
high probability differential, be necessary for an in- 
crease in the probability of instrumental behavior. 
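
The logic of the yoked control can be sketched as a schedule that is recorded from the contingency animal and replayed to its partner. This is a simplified illustration under my own assumptions: access is timed in seconds rather than defined by a number of licks, and the function names and response times are hypothetical.

```python
def access_windows(master_response_times, ratio, access_sec):
    """Periods of tube access earned by the master animal: every
    `ratio`-th instrumental response inserts the tube for `access_sec`."""
    windows = []
    for count, t in enumerate(master_response_times, start=1):
        if count % ratio == 0:
            windows.append((t, t + access_sec))
    return windows


def tube_available(t, windows):
    """True if the tube is inserted at time t. The yoked animal's own
    behavior never enters this function, so no contingency is in effect."""
    return any(start <= t < end for start, end in windows)


# Hypothetical master record: one instrumental response per second.
master_times = [1, 2, 3, 4, 5, 6]
windows = access_windows(master_times, ratio=3, access_sec=10)
print(windows)  # [(3, 13), (6, 16)]
print(tube_available(5, windows), tube_available(2, windows))  # True False
```

The yoked animal thus receives the contingent event in the same temporal pattern as the master, and any change in its nominal instrumental behavior cannot be attributed to the response requirement.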

Various approximations to the yoked control pro- 
cedure described above can be found in some of the 
experimentation concerned with Premack’s theoretical 
position. Most of them attempt to control for the 
reduced amount of contingent responding and do not 
consider changes in the temporal distribution of the 
contingent response produced by the contingency. The 
results of these efforts have been inconsistent. Eisen- 
berger et al. (1967) recognized the possibility that sim- 
ple suppression of the contingent response by the 
instrumental response requirement might produce an 
increase in the instrumental behavior. They con- 
ducted a control experiment (Experiment III of their 
series) in which they removed access to the lever-press 
response (nominally their contingent response) and 
measured the effects of such removal upon the wheel- 
turning response (nominally their instrumental re- 
sponse). Although the complete removal of access to 
the contingent response is not as informative as the 
yoked control described earlier, it does control for the 
effects of simply reducing the amount of contingent 
responding produced by the contingency. The results 
revealed that removal of access to the lever-press re- 
sponse reduced, rather than increased, the amount of 
wheel-turning behavior. Hence they concluded that 
the instrumental requirement was a necessary aspect 
of their procedure and that contingent response re- 
duction alone was not sufficient to increase the proba- 
bility of the instrumental response. 

A more recent experiment suggests that the results 
obtained in the control experiment reported by Eisen- 
berger et al. may have limited generality. While work- 
ing in a different context, I conducted an experi- 
ment in which rats were given free access to a drinking 
tube and a running wheel during daily half-hour ses- 
sions (Dunham, 1972). Subsequent removal of access 
to either one of these responses by braking the wheel 
or removing the tube during the session produced an 
increase in the amount of the alternative behavior 
observed during the session. Contrary to the observa- 
tions made by Eisenberger et al., it would appear that 
removal of a response from the animal’s repertoire 
will, in some cases, produce an increase in an avail- 
able alternative. In our case, the two responses mea- 
sured were topographically quite different. 

Bernstein (1973) has also recently reported data 
relevant to this control question. He used a fascinat- 
ing procedure in which human subjects were placed 
in a controlled experimental environment contain- 
ing various items of interest (e.g., sewing materials, 
reading, art work). The subjects lived continuously in 
this environment for periods as long as 34 consecutive 
days. The amount of time which each subject spent 
indulging in a number of selected activities was 
measured on a 24-hr basis. Following the measurement 
of base line durations, Bernstein arranged various con- 
tingencies in which the subjects were required to 
engage in a fixed amount of a less probable behavior 
in order to gain access to a fixed amount of a more 
probable behavior (e.g., read fiction in order to sew). 
All of the contingencies studied, with one exception, 
increased the amount of instrumental behavior ob- 
served and suppressed the amount of contingent be- 
havior below base line levels. In the case where no 
increase in instrumental behavior was observed, it is 
interesting to note that the contingent response was 
not suppressed by the contingency requirement. 

More important in the present context is Bern- 
stein’s use of a matched control procedure in which 
the same subjects were subsequently exposed to ses- 
sions in which the previously designated contingent 
response was simply restricted in the same manner in 
which the previous contingency had restricted the 
response, without any instrumental response require- 
ment. This procedure comes very close to the yoked 
control procedure described earlier. In two of the 
five instrumental-contingent response pairs which 
were observed, Bernstein found that simply restricting 
the amount of contingent activity (with no instrumen- 
tal response required) was sufficient to produce in- 
creases in the response which had previously been 
designated as the instrumental response. Again we 
see that restricting the contingent response in the 
absence of an instrumental response requirement has 
inconsistent effects. In some instances an increase in 
the nominal instrumental behavior is observed; in 
other cases no changes occur. 

Allison and Timberlake (1974) also attempted to 
assess the effects of a restriction in the contingent re- 
sponse in the absence of an instrumental response 
requirement. In Experiment 3 of their series they 
observed that the amount of unconstrained .4% sac- 
charin drinking (the nominal instrumental response) 
was the same in both the presence and the absence of 
the .1% solution (the nominal contingent response). 
In this case, completely suppressing the contingent 
response appears to have no effect upon instrumental 
behavior in the absence of the contingency—and it 
might be noted that the topography of the instru- 
mental and contingent response was identical: licking 
a drinking tube. 

The results produced by these approximations to 
the yoked control procedure suggested in earlier dis- 
cussion are inconsistent. Simple removal of the con- 
tingent response from the unconstrained repertoire 
has been observed to produce increases, decreases, and 
no change in an alternative response designated as the 
instrumental behavior. If we are going to properly 
assess the relative merits of Premack’s differential 
probability rules and the response deprivation hy- 
pothesis, it behooves us first to determine whether the 
increases in instrumental behavior observed in these 
experiments are the product of the reinforcement op- 
eration (i.e., the response requirement) or whether 
the increase reflects an interaction between uncon- 
strained responses which would have occurred in the 
absence of any instrumental contingency. As I have 
noted in another context (Dunham, 1971), the ques- 
tion of what changes will occur in an unconstrained 
multiple-response repertoire when we restrict (or in- 
crease) one or more members has received very little 
experimental attention. At present we have little 
more than our intuition upon which to base our 
predictions about such changes (cf. Bernstein, 1973). 


PREMACK’S CONTROL REQUIREMENTS 


In addition to the yoked control question discussed 
above, Premack (1971) has outlined a number of 
procedural problems which he suggests will invalidate 
tests of the differential probability rules. We shall 
now briefly consider each of these problems as they 
relate to the negative evidence which has accumulated 
concerning Premack’s position. 

All three of the procedural problems discussed by 
Premack revolve around a loosely defined concept of 
“momentary response probability.” Specifically, he 
suggests that there are three conditions under which 
the total-duration measure of responding obtained 
during base line sessions is likely to be a distorted 
estimate of “momentary response probability” and 
consequently of the reward (or punishment) value of 
a response. The first case described is a situation in 
which the two responses to be used in the contingency 
have a different rate of decay (habituation or satia- 
tion) within a session. If, for example, drinking is 
more probable than running during the first 10 min 
of a 1-hr session, but substantially less probable during 
the last 50 min of the session, a total-duration measure 
would indicate in some cases that running was the 
more probable of the two. Hence in subsequent con- 
tingency sessions a run-to-drink contingency would be 
considered to be a high-to-low contingency based 
upon total-duration measures, but actually should be 
considered a low-to-high contingency for the first 10 
min of the session and a high-to-low contingency dur- 
ing the last 50 min. At first glance, it does seem rea- 
sonable to stipulate that a fair test of the probability 
differential rules require that these within-session 
duration-time curves not intersect. In fact, in the run- 
ning and drinking experiments discussed earlier we 
have measured changes in the probability of drinking 
and running in consecutive 20-min segments of the 
session, and in four subjects so observed both re- 
sponses decline over the 1-hr session but the curves do 
not cross over. Eisenberger et al. (1967) also failed to 
observe within-session crossovers of response proba- 
bility in their experiments. Presumably, then, these 
data would meet the criterion specified by Premack, 
and total duration is a good estimate of “momentary 
response probabilities.” 

A more basic problem arises, however, when one 
considers how “momentary” a momentary probability 
estimate should be. If, for example, one divided an 
entire session of running and drinking into successive 
5-min segments, a number of reversals in response 
probability would be observed over the course of the 
session. Using this latter time scale on the abscissa, 
our experiments would not be considered a proper 
test of Premack’s position. As suggested earlier, the 
concept “momentary probability” needs to be defined 
more precisely if it is to be anything other than a 
post hoc analysis of obstreperous results. 
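
The dependence of the crossover criterion on the time scale can be shown with a small sketch. The records below are hypothetical and far shorter than a real session; each list holds one flag per scan interval (1 if the response occurred in that interval). The function names are illustrative.

```python
def segment_probabilities(flags, seg_len):
    """Proportion of occupied scan intervals in each consecutive segment."""
    probs = []
    for i in range(0, len(flags), seg_len):
        segment = flags[i:i + seg_len]
        probs.append(sum(segment) / len(segment))
    return probs


def count_reversals(p_a, p_b):
    """Number of times the more probable response changes from one
    segment to the next (segments with equal probabilities are skipped)."""
    signs = [1 if a > b else -1 for a, b in zip(p_a, p_b) if a != b]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


# Hypothetical records, 12 scan intervals each.
run = [1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0]
drink = [0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

for seg_len in (6, 2):
    reversals = count_reversals(segment_probabilities(run, seg_len),
                                segment_probabilities(drink, seg_len))
    print(seg_len, reversals)  # prints "6 0" and then "2 2"
```

With long segments running is the more probable response throughout; with short segments the same records show two reversals. Whether a given experiment counts as a proper test therefore depends on a segment length which has not been specified.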

A second stipulation, if we are to obtain good 
estimates of “momentary probability,” is that the 
parameter values used during the contingency ses- 
sions be the same as those used to measure the inde- 
pendent probability of responding during base line 
sessions. Premack (1971) suggests: 


For example, if the reinforcement session is to 
use a [variable-interval] 60 second schedule with 
a contingent [response] time of 5 seconds, then 
exactly those parameters should be used in the 
measurement of response probabilities. (p. 130) 


Fig. 3. See text for explanation. (From Premack, 1969, p. 132.) [Abscissa: Time.] 


Unless this requirement is met, it is suggested that 
the total duration measure will be a distorted esti- 
mate of “momentary response probability.” Again, 
however, this requirement poses some problems for 
the experimenter who wishes to meet the conditions 
specified. Unless the experimenter exercises complete 
control over the onset and offset of both instrumental 
and contingent responses (a condition which would 
eliminate the response requirement), it is the case that 
any instrumental contingency which we arrange will 
produce some changes in the manner in which the 
organism distributes its behavior in time when com- 
pared to base line performance. Consequently, it is 
impossible strictly to meet the requirement imposed 
by Premack. The best one can do is keep the base line 
and contingency situations as similar as possible and 
depend upon the yoked control discussed earlier to 
detect any changes in instrumental response probabil- 
ity produced by the temporal constraints placed upon 
contingent behavior. 

The third procedural problem discussed by Pre- 
mack (1971) can best be explained by reference to 
Figure 3. Referring to this figure, he says: 


Response A depicted in the curve on the right 
attains an extremely high probability at rela- 
tively long intervals, whereas response B shown 
on the left attains half that probability but at 
half the interval. The average probability of the 
two responses are thus equal; however, their 
momentary reinforcement values will not be 
equal. (p. 131) 


The spirit of this suggestion is that certain re- 
sponses like copulation (Response A) do not occur 
very frequently, but when they do occur they should 
be considered to be at the top of the preference struc- 
ture. With all due respect to the data of introspection, 
this stipulation reduces in operational terms to the 
suggestion that the duration of some responses is not 
a proper estimate of their relative position in the 
preference structure. Again, an experimenter who 
wishes to comply with this stipulation must have 
some criterion for knowing when he is dealing with a 
response like Response A in Figure 3. The unique 
property of Response A appears to be that it has a 
relatively long interval which separates successive in- 
stances of the behavior and a high position in the 
preference structure when an instance of the behavior 
is observed. The unique property of Response B is 
that it has a relatively short interval which separates 
successive instances of the behavior and a relatively 
low position in the preference structure when an 
instance of the behavior is observed. Obviously all 
responses have these two properties in differing de- 
grees. Unless it is specified how long an interresponse 
interval, for a given session length, is required for 
membership in the class of Response A, or how short 
an interval is required for membership in the class of 
Response B, we cannot know if we have underesti- 
mated or overestimated the “momentary probability” 
of these responses by measuring their total duration. 
With reference to our running and drinking experi- 
ments, we typically observe longer interresponse inter- 
vals for running than we do for drinking during the 
1-hr session, but one cannot determine whether run- 
ning deserves membership in the special class of Re- 
sponse A from the information provided in Figure 3. 

To summarize briefly, it appears to be difficult, if 
not impossible, to meet the requirements specified by 
Premack for a valid test of the differential probability 
rules. In the experiments cited as evidence against 
the differential probability rules, it is possible to sug- 
gest that the total-duration measures are, indeed, a 
distorted estimate of the “momentary probability” of 
the contingent responses employed. I would argue 
that such an interpretation is necessarily an ad hoc 
analysis until such time as the concept “momentary 
probability” and its relationship to the total duration 
measure are more precisely defined. 


An Optimal-Duration Model as an 
Alternative Interpretation 


If we assume that subsequent research continues to 
question Premack’s position and to support the re- 
sponse deprivation hypothesis, we are placed in a 
curiously counterintuitive position. The evidence 
cited thus far suggests that the organism will, under 
certain conditions, increase the probability of a re- 
sponse which is instrumental in placing it in a less 
preferred state—when preference is measured in terms 
of the total duration of responding during base line 
sessions. For example, most people, given the choice, 
would prefer eating in a nice restaurant to visiting 
the dentist. Taken literally, the response deprivation 
hypothesis suggests that it is possible to increase my 
patronage of local restaurants by arranging a contin- 
gency in which I must go to a restaurant in order to 
visit my dentist, and that this contingency will work 
only if it reduces the amount of time I spend at the 
dentist below my base line, unconstrained level of 
dental care. 

Rather than accept such a counterintuitive notion, 
it would seem more judicious at present to suggest 
that the total-duration measure is a distorted measure 
of preference. It may still be possible to develop an 
alternative measure which will permit us to maintain 
Premack’s fundamental, commonsense assumptions 
about a symmetrical reward-punishment mechanism 
that depends upon the organism’s preference struc- 
ture. 

For the remainder of the discussion in this section 
of the chapter, I shall outline, in rudimentary form, 
a reinforcement model which maintains many of the 
features of Premack’s analysis but suggests a basic 
change in the manner in which we measure the or- 
ganism’s preference structure. It is designed to detect 
the momentary changes in preference that are neces- 
sarily obscured by the total-duration measure. 

The most convenient way to describe the essential 
features of the model is to make reference to the 
multiple-response repertoire of our gerbil again. As- 
sume for the moment that our observations of the 
gerbil for a 20-min period each day revealed that the 
gerbil spent most of its time shredding paper. Given 
a definition for the onset and offset of the paper- 
shredding behavior, two properties of the behavior 
are considered important. The first property, called
the burst duration, refers to the amount of time spent
paper shredding once the animal enters the state.
The second, the interburst interval, refers to the amount
of time observed between successive bursts of paper 
shredding. Assume that we have made these tedious 
observations of the gerbil’s paper-shredding behavior 
over a period of several unconstrained base line ses- 
sions and the results are those observed in Figure 4. 
Looking first at the frequency distribution for burst
durations (solid-line curve), the most frequently observed
duration, represented as point A on the curve, is 10 sec,
and all instances observed during base line sessions fall
between a minimum duration of 1 sec and a maximum
duration of 20 sec. Looking now at the frequency
distribution of interburst intervals, the most frequently
observed interval between successive bursts of paper
shredding is 60 sec (point A’ on the solid curve) with a
minimum interval of 1 sec and a maximum interval of 120 sec.

Fig. 4. Hypothetical frequency distributions for burst durations
and interburst intervals of paper shredding in the gerbil. As
the burst-duration distribution is displaced to the left, the
paper-shredding behavior is a preferred event; as it is displaced
to the right, paper shredding acquires negative properties. As the
interburst interval distribution is displaced to the left, the
paper shredding acquires negative properties; as it is displaced
to the right, it is a preferred event.
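
The two base line measures described here can be tallied mechanically. The following is a minimal sketch in Python, assuming that each burst of paper shredding has been logged as an (onset, offset) pair in seconds. The function names, the record format, and the sample record are hypothetical illustrations and are not data from any experiment.

```python
from collections import Counter


def burst_measures(bouts):
    """Return (burst durations, interburst intervals) in seconds.

    `bouts` is a list of (onset, offset) times, sorted by onset and
    non-overlapping. This record format is an assumption of the sketch.
    """
    durations = [offset - onset for onset, offset in bouts]
    intervals = [
        next_onset - offset
        for (_, offset), (next_onset, _) in zip(bouts, bouts[1:])
    ]
    return durations, intervals


def modal_value(values):
    """Most frequently observed value, taken here as the 'optimal' one."""
    return Counter(values).most_common(1)[0][0]


# Hypothetical base line record, chosen to echo the values of Figure 4.
bouts = [(0, 10), (70, 80), (140, 148), (208, 218), (278, 290)]
durations, intervals = burst_measures(bouts)

print(modal_value(durations))   # modal burst duration
print(modal_value(intervals))   # modal interburst interval
```

In practice the observed values would probably be grouped into class intervals before the mode is taken; the sketch uses whole seconds only to keep the tally simple.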

I suggest that a knowledge of these two distribu- 
tions, and of the variables which alter them, is essen- 
tial for predicting whether a particular contingent 
event will function as a reward or as a punishment in 
an instrumental contingency. If we consider, for ex- 
ample, paper shredding as a contingent event for 
instrumental running behavior, the base line observa- 
tions we have made of paper shredding tell us how 
the animal prefers to expose itself to paper shredding 
over the time permitted in the session. These base 
line observations should, of course, be made with run- 
ning also available, since the presence of the running 
response per se will very likely be one of the variables 
which would alter the burst duration and interburst 
interval distributions of paper shredding. Assume that 
we can regulate the exact amount of paper shredding 
which the gerbil is permitted for each completion of 
the instrumental response requirement—specifically,
that we can either give the gerbil access to the paper 
for durations shorter than its preferred (average) base 
line duration or force the gerbil to shred paper for 
durations longer than its preferred (average) base line 
burst duration. 

To the extent that the contingency which we ar- 
range between running and paper shredding shortens 
the intervals between successive bursts of paper shred- 
ding and/or lengthens the duration of a burst (rela- 
tive to base line durations), I would suggest that 
paper shredding will function as an aversive event. 

To the extent that the contingency we arrange 
lengthens the interval between successive bursts of 
paper shredding and/or shortens the bursts of paper 
shredding (relative to base line durations), I would 
suggest that paper shredding will function as a posi- 
tive event. Finally, if the contingency we arrange 
simply feeds paper shredding back to the gerbil in a 
way that does not alter his record of self-exposure (as 
manifested in the two distributions), we should ob- 
serve neutral motivational properties as evidenced by 
no change in the probability of instrumental be- 
havior. 

These general predictions made by the model can 
be made more specific by reference to Figure 4. Using 
these hypothetical data for paper shredding, it is sug- 
gested that given no change in the distribution of in- 
terburst intervals, paper shredding will function as a 
reward as long as the contingency we arrange dis- 
places the distribution of burst durations in the direc- 
tion of durations less than the optimal duration of 
10 sec as observed during base line sessions. Con- 
versely, assuming that we can force the behavior, 
paper shredding will function as an aversive event as
long as the contingency we arrange displaces the
distribution of burst durations in the direction of
durations greater than the optimal duration of 10 sec (see
the broken-line curves). 

Given no change in the distribution of burst
durations, paper shredding will function as a reward as
long as the contingency we arrange displaces the dis- 
tribution of interburst intervals in the direction of 
intervals greater than the optimal interburst interval 
of 60 sec, as observed in base line sessions. Conversely, 
assuming that we can force the behavior, paper shred- 
ding will function as an aversive event as long as the 
contingency we arrange displaces the distribution of 
interburst intervals in the direction of intervals 
shorter than the optimal interburst interval of 60 sec. 
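
The predictions stated in the preceding paragraphs can be summarized as a small decision rule. The sketch below is only an illustration under stated assumptions: each distribution is reduced to its modal (optimal) value, the function name and argument names are hypothetical, and the case in which the two displacements point in opposite directions is returned as "unspecified" because the text makes no prediction for it.

```python
def predict_effect(base_burst, base_interval, new_burst, new_interval):
    """Classify a contingent event under the optimal-duration model.

    All arguments are modal values in seconds: the base line (optimal)
    burst duration and interburst interval, and the values produced by
    the contingency. Reducing each distribution to a single value is a
    simplifying assumption of this sketch.
    """
    # Shorter bursts or longer intervals than base line: reward.
    rewarding = new_burst < base_burst or new_interval > base_interval
    # Longer bursts or shorter intervals than base line: aversive.
    aversive = new_burst > base_burst or new_interval < base_interval

    if rewarding and aversive:
        return "unspecified"  # opposing displacements are not addressed
    if rewarding:
        return "reward"
    if aversive:
        return "aversive"
    return "neutral"


# Hypothetical paper-shredding values from Figure 4: 10 sec and 60 sec.
print(predict_effect(10, 60, 5, 60))    # bursts shortened
print(predict_effect(10, 60, 15, 60))   # bursts lengthened
print(predict_effect(10, 60, 10, 90))   # intervals lengthened
print(predict_effect(10, 60, 10, 30))   # intervals shortened
print(predict_effect(10, 60, 10, 60))   # base line pattern fed back
```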

To digress briefly, it might be noted that the yoked 
control procedure, discussed earlier with reference to 
the experimental tests of Premack’s differential
probability rules and the response deprivation hypothesis,
should also be considered in the context of the present 
model. It would be interesting, for example, to ob- 
serve the effects upon other unconstrained responses 
in the gerbil’s repertoire of placing various temporal 
constraints upon paper shredding (changing the burst 
duration and interburst interval distributions with or 
without changes in total durations). It is possible that 
there would be systematic changes in other responses 
similar to those produced by arranging an instrumental
contingency. These operations make some contact with
the recent suggestions made by Baum (1973)
concerning a correlation-based law of effect.

Since the predictions made by this conceptual 
scheme are made with reference to the optimal dura- 
tions observed during base line sessions, the model 
will be called an optimal-duration model in future 
discussion. 

To conclude the discussion of the optimal-duration 
model, it is instructive to describe one additional 
hypothetical procedure which would pit the predic- 
tions made by this model against both Premack’s 
differential probability rules and the more recent re- 
sponse deprivation hypothesis. Because some of the 
predictions to be discussed require us to “force” the 
contingent response for durations which exceed the 
optimal duration, we shall consider a modification of 
the gerbil environment described earlier in which the 
animal now has access to only two sources of enjoy- 
ment. Assume that the gerbil is placed in a motorized 
activity wheel which can be activated by simply 
rotating the wheel 10 deg in either direction. In
addition, simply rearing up on the hind legs is sufficient to
produce an intracranial stimulus (ICS) in the lateral
hypothalamic area using shock parameters within the
“positive” range. Dropping back on all four feet
terminates the ICS.

Assume also that half-hour unconstrained access to
each of these events reveals that the gerbil spends a
total of 10 min per session running in the wheel and a
total of 5 min receiving the ICS. Consider the
predictions of the three models when we arrange some
possible instrumental contingencies between these
two responses.

The predictions made by Premack’s differential
probability rules are quite clear. Running is the more
probable contingent event; hence a self-stimulate-to-run
contingency (low-to-high probability) should produce
an increase in the amount of self-stimulation the
animal will take. Conversely, a run-to-self-stimulate
contingency (high-to-low) should produce a decrease
in running.

The predictions made by the response deprivation 
hypothesis are very different. According to this
position, either a self-stimulate-to-run or a
run-to-self-stimulate contingency will produce increases in
instrumental responding if the contingency arranged
reduces the amount of the contingent event below the
base line level. Hence, contrary to Premack’s rules, the
run-to-self-stimulate contingency will, under response
deprivation conditions, increase the probability of
running.
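
The contrast between these two positions can be made concrete with the hypothetical base line totals given above (10 min of running and 5 min of ICS per session). The sketch below is an illustration only. The function names are hypothetical, and the deprivation test rests on an assumed formalization: a schedule is taken to deprive the animal if performing the instrumental response at its base line level would earn less of the contingent response than the base line total. The schedule values (60 sec of running per 10 sec of ICS) are likewise invented for the example.

```python
def premack_prediction(base_instrumental, base_contingent):
    """Differential probability rule, using base line total durations."""
    if base_contingent > base_instrumental:
        return "increase"      # low-to-high contingency
    if base_contingent < base_instrumental:
        return "decrease"      # high-to-low contingency
    return "no change"


def is_response_deprived(base_instrumental, base_contingent,
                         required, access):
    """Assumed test of response deprivation.

    `required` seconds of the instrumental response buy `access`
    seconds of the contingent response. The animal is treated as
    deprived if base line instrumental responding would earn less
    contingent responding than the base line total.
    """
    earned = base_instrumental * access / required
    return earned < base_contingent


RUN, ICS = 600, 300  # base line totals in seconds (10 min and 5 min)

# Run-to-self-stimulate: running is instrumental, ICS is contingent.
print(premack_prediction(RUN, ICS))
print(is_response_deprived(RUN, ICS, required=60, access=10))

# Self-stimulate-to-run: ICS is instrumental, running is contingent.
print(premack_prediction(ICS, RUN))
```

Under these assumed schedule values, base line running would earn only 100 sec of ICS against a base line total of 300 sec, so the response deprivation hypothesis predicts an increase in running where the differential probability rule predicts a decrease.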

According to the optimal-duration model, either of 
the above contingencies (run-to-self-stimulate or self- 
stimulate-to-run) can be demonstrated to produce 
either the reward or the punishment outcome or no 
change depending upon how we distribute the con- 
tingent event in time. Consider, for example, a high- 
to-low contingency in which the animal is required to 
run in order to self-stimulate. Assume, also, for the 
moment that our base line observations indicated that 
the optimal duration for self-stimulation was 5 sec 
with an optimal-duration interburst interval of 60 sec
between successive stimuli. If we now arrange the 
contingency in which each ICS is 5 sec in duration, 
the ICS will increase the probability of running as 
long as the interburst interval distribution is dis- 
placed in a direction of intervals greater than the 
optimal-duration interburst interval (60 sec). Con- 
versely, if the interburst interval produced by the 
contingency is shorter than the 60-sec optimal dura- 
tion, a 5-sec ICS will be aversive. On the other hand, 
if the contingency we arrange does not alter the inter- 
burst interval distribution, we can make the ICS 
aversive by displacing the distribution of burst dura- 
tions in a direction greater than the optimal duration 
of 5 sec, or we can make it positive by displacing the 
distribution of burst durations in a direction less than 
the 5-sec optimal duration. 

Finally, if one feeds back the ICS as a tape record- 
ing of the pattern revealed by the animal during the 
base line observations, it is predicted that the event 
will be motivationally neutral as indicated by no 
change in the probability of the instrumental re- 
sponse. 

These predictions differ from those made by Pre- 
mack in specifying that there are certain conditions in 
which a high-to-low probability contingency (as de- 
fined by total duration of base line responding) will 
produce an increase in the probability of the
instrumental response. The predictions also differ from
those made by the response deprivation hypothesis. 
The optimal-duration model does not require that 
the amount of contingent responding permitted dur- 
ing a session be suppressed below the base line total 
duration of that response. It requires instead that (a) 
each instance of the contingent response be shorter
than the optimal duration of base line bursts; and/or 
(b) the intervals between instances of contingent re- 
sponding be longer than the base line optimal dura- 
tion of interburst intervals. The total duration of 
contingent responding can remain the same as it was 
during base line sessions, and it is predicted that 
either of the above constraints on the contingent be- 
havior will be sufficient to produce an increase in 
instrumental responding. 

Although we do not yet have appropriate data to 
support the optimal-duration model, I would argue 
that it provides a viable alternative interpretation to 
the response deprivation hypothesis and has the ad- 
vantage of specifying the conditions under which we 
should observe increases, decreases, and no change in 
instrumental behavior. As stated earlier, the optimal- 
duration model is, in some sense, an attempt to trans- 
late Premack’s notion of “momentary probability” 
into operational terms. Basically, the two distributions 
described are assumed to provide a better basis than 
does the total duration of responding for predicting 
when a contingent event will be more or less probable 
than an instrumental event. In more recent work, 
Premack and his students appear to recognize the 
possible importance of the organism’s more momen- 
tary tendencies to turn a response off or on. In the 
punishment study by Terhune and Premack (1970)
mentioned earlier, the amount of suppression produced
by a less probable running response was demonstrated
to be a linear function of the probability that
the animal would turn off the running response at 
fixed times after its initiation. Obviously, the prob- 
ability that the organism will turn off a particular 
response some fixed time after its initiation is readily 
derived from the burst duration distribution, which is 
one component of the optimal-duration model. Hence 
the results reported by Terhune and Premack can be 
subsumed by both the optimal-duration model and 
Premack’s differential probability rules. 
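
The derivation mentioned above, from the burst duration distribution to the probability that a response is turned off by a fixed time after its initiation, amounts to a cumulative proportion. The sketch below assumes a hypothetical list of base line burst durations in seconds; the function name and the sample values are illustrative only and do not reproduce the procedure or data of Terhune and Premack.

```python
def prob_off_by(durations, t):
    """Proportion of base line bursts that had ended within t seconds.

    `durations` is a non-empty list of burst durations in seconds.
    The result estimates the probability that the animal turns the
    response off no later than t seconds after initiating it.
    """
    return sum(d <= t for d in durations) / len(durations)


# Hypothetical base line burst durations (sec) with a mode of 10 sec.
durations = [2, 5, 5, 8, 10, 10, 10, 12, 15, 20]

for t in (5, 10, 20):
    print(t, prob_off_by(durations, t))
```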


BIOLOGICAL CONSTRAINTS ON 
REINFORCEMENT 


With the possible exception of autonomically in- 
nervated responses (cf. Miller, 1969), we have typically 
assumed that all responses in an organism’s repertoire 
are equally eligible for modification using an instru- 
mental learning procedure. Premack’s differential 
probability rules, the response deprivation hypothesis, 
and the optimal-duration model described in the first 
part of this chapter all reflect this assumption. These 
positions specify the necessary and sufficient conditions
for the observation of changes in instrumental be- 
havior and assume that these conditions will hold, in 
general, for any pair of responses in any particular 
species. 

Recently, however, evidence has been accumulating 
which suggests that certain responses may not be sus- 
ceptible to such modification while others are highly 
susceptible. The general explanation which has
emerged to account for such evidence is the notion 
that the organism brings certain “biological
predispositions” into the instrumental learning situation
which can either facilitate or inhibit the effectiveness 
of certain response-reinforcer combinations (cf. Selig- 
man, 1970). 

An example of the type of evidence which has pro- 
vided the basis for speculation about such biological 
constraints is Shettleworth’s (1973) recent work. She
has demonstrated that food is an effective reinforce- 
ment for a number of responses emitted by the ham- 
ster, Mesocricetus auratus, but will not function as a 
reward for various grooming responses such as face
washing. Data such as these have at least two major 
implications in the context of the present discussion. 
First, they directly question the assumption that rein- 
forcers are transituational. If food is an effective re- 
ward for rearing responses made by hamsters, it 
should be an effective reward for face-washing re- 
sponses in the same species. Hence, along with Pre- 
mack’s model, the response deprivation hypothesis, 
and the optimal-duration model discussed earlier, the 
research concerned with biological constraints on 
instrumental learning runs counter (for very different 
reasons) to the contemporary tendency to accept the 
weak law of effect as a point of departure in our 
analysis of reinforcement. Second, an adequate theory 
of reinforcement will have to take account of any
evidence that there are certain response-reinforcer
combinations which are not effective (or vice versa) and
will have to describe the conditions under which we 
can expect to observe such constraints. 

Consider, for example, Premack’s notion that a 
low-to-high probability contingency will produce an 
increase in the probability of instrumental responding.
This assumption is directly questioned by Shettleworth’s
hamster data. Eating was more probable than
either rearing or face washing. Nevertheless, eating 
would reinforce rearing but not face washing. Pre- 
mack’s position seems at the very least to require an 
additional assumption to subsume these discrepant 
cases. 

In the discussion which follows I shall critically 
examine some of the theory and data concerned with 
the role of biological constraints in instrumental
learning situations. I have attempted to restrict the 
discussion to the most recent data and to arguments 
specifically concerned with constraints upon instru- 
mental response-reinforcer combinations. For a more 
extensive discussion of earlier work and a broader 
perspective, the reader is urged to consult a number 
of different reviews already available in the literature 
(cf. Bolles, 1970; Garcia, McGowan, & Green, 1972;
Moore, 1971, 1973; Seligman, 1970; Shettleworth, 
1972, 1973; Staddon & Simmelhag, 1971). 

The various attempts to explain why particular 
response-reinforcer combinations are effective while 
others are not are, in my opinion, all variations on a 
common theme. The general theme is that an orga- 
nism has certain species-typical behavior patterns which 
it exhibits under certain conditions in its natural 
habitat. To the extent that a particular
response-reinforcer combination requires behavior which is
consistent with the organism’s natural behavior, the
reinforcer will be very effective in controlling responding;
to the extent that the combination is inconsistent with 
the organism’s natural behavior, the reinforcer will be 
ineffective. I shall consider several variations on this 
theme which are to varying degrees successful in trans- 
lating this idea into a testable proposition. 


The “Functional Relevance” Hypothesis 


The basic notion is that certain responses are func- 
tionally relevant in the context of certain reinforcers 
but not in the context of other reinforcers. For 
example, in hamsters, digging responses are usually
observed as a part of procuring food, but not seen in 
situations involving sexual behaviors. The functional 
relevance hypothesis suggests that a sexually receptive 
female would not be particularly effective as a
reinforcer for digging, since digging is not particularly
relevant in the organism’s species-typical reactions to
sexual stimuli. 

Three representative versions of the “functional
relevance” hypothesis have emerged in the context of: 
(1) Shettleworth’s work with instrumental reward pro- 
cedures; (2) work by Walters and Glazer (1971) and 
Melvin and Ervey (1973) with instrumental punish- 
ment procedures; and (3) Bolles’s (1970) work with 
avoidance learning procedures. 

Shettleworth (1973) observed golden hamsters in a 
chamber similar to an open field apparatus which 
had a sand floor, a magazine for delivering food pel- 
lets, and a typical response lever. A number of species- 
typical behavior patterns were measured: (1) face 
washing, (2) digging, (3) open rearing, (4)
“scrabbling” (which is a type of digging movement with the
forelegs on the side walls), and (5) bar pressing (shap- 
ing was necessary). After reducing the animals to 80% 
of their ad lib weights, different groups were exposed 
to five possible instrumental contingencies with food 
as the contingent event (e.g., scrabble to eat, face-wash 
to eat, etc.). Contingent upon performing the
appropriate instrumental response for a period of 0.5 sec,
the food pellets were delivered on a variable-interval 
(VI) 20-sec schedule (following initial sessions of con- 
tinuous reinforcement). The results demonstrated con- 
vincingly that the hamsters increased the probability 
of four of the five instrumental responses. Face wash- 
ing was the only instrumental response which could 
not be brought under the control of the reinforce- 
ment contingency, even under extended training con- 
ditions with continuous reinforcement. 

To account for these results, Shettleworth suggested 
that face washing was perhaps the only response of the 
five selected which was not a part of the animal’s nat- 
ural behavioral sequence in obtaining food. There 
are, however, a number of alternative explanations. 
First, as Shettleworth recognizes, the hamster needs a 
supply of saliva in order to lick its paws for face wash- 
ing. It is possible that the use of the dry food pellets 
acts to decrease the amount of face washing by using 
up excess saliva for food mastication. Or the rate of 
saliva production per se may simply be inadequate for 
the maintenance of high rates of face washing; i.e., 
this may be a trivial instance of requiring an animal 
to perform a response which is beyond its physical 
capability (like training a rat to fly). As Shettleworth 
suggests, it would be interesting to repeat these experi- 
ments with water as the reinforcer for thirsty hamsters, 
thus eliminating the problem of a “dry mouth.” 

There is another explanation for these data that is 
of particular interest because it has general implica- 
tions for research of the type reported by Shettle- 
worth. One of the several effects of arranging an in- 
strumental contingency is that the contingent event 
adds a response to the organism’s repertoire in the 
experimental situation. The food used in Shettle- 
worth’s experiments produced a certain amount of 
eating behavior. The introduction of this response to 
the organism’s repertoire can, in and of itself, facil- 
itate or inhibit other instrumental responding. An 
extreme case of this can be seen in Figure 5, which 
presents the results of a simple experiment conducted 
in our laboratory. 

Subject A was a hungry gerbil given free access to 
three items in an experimental chamber for 1 hr each 
day. The items were a running wheel, a box of
sunflower seeds, and a strip of bristol board for shredding.

Fig. 5. Paper-shredding and eating response probability as a
function of the presence and absence of a running wheel in 
the environment. 


The animal spent 80% of the session running in the 
wheel, a small amount of time eating seeds, and no 
time at all shredding the bristol board. In Phase 2 
gerbil A was denied access to the running wheel, and 
the gerbil added paper shredding to the response reper- 
toire and increased the amount of time spent eating. 
Subject B was exposed to the same conditions in 
reverse order. During Phase 1 the wheel was not avail- 
able, and paper shredding was the most probable 
response in the response repertoire. In Phase 2 the 
addition of the wheel to the chamber reduced and 
eventually eliminated paper shredding, and the ani- 
mal spent 70-80% of the session running. 

Thus the addition of running to the response 
repertoire of the gerbil eliminates paper shredding. 
The results in general suggest that the simple addition 
or deletion of a response from the repertoire can have 
profound effects upon the probability of other
existing responses. Although the simple experiment has
yet to be completed, it would not be surprising to find 
that running is not a terribly effective reward for 
paper shredding, since the former has direct inhib- 
itory effects upon the latter when the two are observed 
together in the complete absence of the instrumental 
contingency operation. 

It is possible that the basis for predicting the effec- 
tiveness of particular response-reinforcer combinations 
is an understanding of the conditions under which 
interactions occur between responses in the absence of 
the contingency. The implication of these observa- 
tions is that a control condition should always be em- 
ployed in research of the type reported by Shettle- 
worth in which we measure the probability of the 
instrumental response in the simple presence and 
absence of the contingent response prior to arranging 
any contingency between the two. The reader might 
note that the rationale for this control is identical to 
the rationale proposed for the use of a yoked control 
procedure in the context of Premack’s position (p. 
106). 

Shettleworth gives some attention to this problem 
of direct inhibitory effects in preliminary work which 
assessed the effects of both food deprivation and pellet 
delivery upon the various behaviors measured in the 
experimental situation. Her observations suggest that 
such inhibitory interactions among responses might 
account for the results. For example, she notes that 
food deprivation (which presumably increased the 
probability of eating) appeared to inhibit face-wash- 
ing activity, with little effect upon the other responses. 
It was not clear from her results if the simple addition 
of the food pellets per se had any effects upon the 
alternative responses. 

The counterpart to Shettleworth’s work with “ap- 
petitive” response-reinforcer combinations is some 
recent work concerned with the possible biological
constraints upon “aversive” response-reinforcer
combinations. Walters and Glazer (1971), for example, suggest
that the organism’s species-typical responses to
aversive stimulation may not be susceptible to the usual
suppressive effects of response-contingent punishment.
They observed the behavior of Mongolian gerbils in a
large chamber with a sand floor. Two responses, alert
posturing and digging in the sand, were selected for 
study from the gerbil’s repertoire. The rationale for 
selecting these two behaviors was as follows: 


Given the different biological functions of these 
behaviors, it was felt that they would be differ- 
entially affected by punishment. Specifically, we 
expected digging to be relatively easily
suppressed by punishment, while alert posturing,
since it itself is a reaction to sudden or aversive 
stimuli, was expected to show little suppressive 
effect. (p. 332) 


Following their base line observations of digging 
and alert posturing, and a series of habituation ses- 
sions during which a tone was made contingent upon 
either alert posturing (eight gerbils) or digging (eight
gerbils), the gerbils were split into four groups. Two 
groups received daily classical conditioning sessions 
in which a tone was paired with a relatively strong 
shock (2 mA, 1 sec). These conditioning sessions were
conducted inside a plexiglass chamber with a grid 
floor which was placed inside the larger sand chamber. 
Two control groups received identical treatment with- 
out the shock being administered. Following each 
daily conditioning session, the gerbils were placed 
back in the experimental chamber with the sand floor 
and the tone was made contingent upon the digging 
response (eight gerbils) or upon posturing responses (eight
gerbils). When compared to control subjects which 
had not received the tone-shock pairings, digging was 
suppressed by the presentation of the conditioned 
punisher (tone) contingent upon digging. However, 
alert posturing was facilitated by presentation of the 
conditioned punisher. Walters and Glazer conclude 
that the differential effect of the conditioned punisher 
can best be understood in terms of the differential bio- 
logical functions served by these two responses in the 
gerbil’s natural environment. Digging is presumably
associated with nest building and food gathering, and
alert posturing with the animal’s defensive reactions.

There is, however, a plausible alternative explanation
for these results. We have observed, informally,
that these desert rodents appear to reduce the
aversiveness of grid shocks by rocking back on the insulating
fur of their hind legs into a position which very much
resembles alert posturing as described by Walters and
Glazer. Hence, it seems possible that the animals
exposed to the classical conditioning procedure might
have learned this type of alert-posturing response as 
a preparatory response to the tone in anticipation of 
the impending shock. During subsequent punishment 
training, the gerbil’s reaction to the tone would then 
be the conditioned “posturing” response. This would 
also account for their observation that animals pun- 
ished for digging also increased the amount of alert 
posturing during suppression of the digging response. 

Presumably, if digging could have served as a 
preparatory response during classical conditioning 
sessions, results exactly opposite to those reported by 
Walters and Glazer could have been obtained. Any
reference to the presumed biological function of the 
response in the animal’s natural habitat would be 
unjustified. 

Until appropriate controls are added to determine 
if such preparatory behaviors were acquired during 
the classical conditioning phase of the study, one can- 
not accept the interpretation offered by Walters and 
Glazer. 

A second, more recent experiment, which claims to 
have demonstrated that the organism’s species-typical 
reactions to danger or threat are not susceptible to the 
usual suppressive effects of punishment, was reported 
by Melvin and Ervey (1973). These authors punished
the gill-extension response which is part of the aggres- 
sive display of Siamese fighting fish (Betta splendens) 
in response to the presence of a conspecific (or mirror 
image). In this experiment each fish was exposed to 
its mirror image for 60 consecutive trials, each 2 min 
in duration, with an intertrial interval of 40-60 sec. Us- 
ing this basic procedure the fish were randomly assigned 
to one of five different treatment conditions. Group HC 
was a habituation control to monitor the decline in 
responding which is typically observed over trials 
with this procedure; two groups, 7E and 7L, received 
a 7-V electric shock as the punishing stimulus 
either early in training (trials 16-30), or late in train- 
ing (trials 45-60). Similarly, two groups, 13E and 13L, 
received a 13-V electric shock either early or late in 
training. The results indicated that the intense
13-V shock suppressed gill extension whether
presented during early or late trials. The 7-V mild
shock, however, did not suppress the gill extension 
when presented during early trials and facilitated gill 
extension responses when presented during the late 
trials. Suppression or facilitation was always mea- 
sured relative to the performance of the habituation 
controls, which revealed a steady decline over the 
course of training. 

Melvin and Ervey suggest that the facilitation of 
gill extension in the 7L group indicates that
species-typical aggressive behaviors are perhaps not
susceptible to the usual suppressive effects of mild
punishment. Again, however, a number of problems with the
experiment suggest that the conclusion is premature. 

First, a control is required in which a referent re- 
sponse (with the same operant rate) is used that does 
not have the unique biological function. It is possible
that Melvin and Ervey would have observed the same 
results using sexual or feeding behavior, indicating 
that the results were not unique to the presumed 
function of the response. 

Second, it seems possible that the same results 
might be obtained by simply presenting a novel
stimulus to the 7L group instead of the mild electric shock
which was assumed to be aversive. Brimer (1970) has 
demonstrated that the effects of a novel stimulus upon 
an ongoing response are dependent upon the operant 
level of that response. For a given response, the pres- 
entation of a novel stimulus will reduce the rate of 
responding if that rate is very high or increase (dis- 
inhibit) the rate of responding if the base line re- 
sponse is very low. Since gill extension had habituated 
to about half its original probability when the 7L 
group received the mild shock, it is possible that the 
increase was an instance of disinhibition similar to 
that described by Brimer. A novel stimulus control 
group would presumably answer this question. 

In view of the problems with the preceding experi-
ments, and of a number of other experiments in the
literature which suggest that elicited defensive or 
aggressive reactions can be suppressed by response- 
contingent punishment (cf. Azrin, 1970; Baenninger 
& Grossman, 1969), the view that the animal’s defen- 
sive response repertoire is refractory to the effects of 
punishment contingencies seems questionable at this 
point. 

Having considered the recent evidence which ar- 
gues for important biological constraints in the con- 
text of reward and punishment contingencies, we 
can now turn to the third variation on the basic 
theme which occurs in the context of avoidance learn- 
ing. Bolles (1970) has been most explicit in developing 
the notion of biological constraints on avoidance 
learning. His hypothesis, which is referred to as the 
species-specific defense reaction (SSDR) hypothesis, 
states: 


For an [avoidance response] to be rapidly 
learned in a given situation, the response must 
be an effective SSDR in that situation and when 
rapid learning does occur, it is primarily due to 
the suppression of ineffective SSDR’s. (p. 35)


Bolles uses this argument effectively to explain why 
certain responses such as lever pressing are not readily 
acquired as avoidance responses (cf. Meyer, Cho, & 
Wesemann, 1960) and why, when such responses are 
readily trained, they are observed to be modified 
forms of defensive reactions such as freezing (cf. Bolles 
& McGillis, 1968). 

The primary evidence cited in support of Bolles’s 
hypothesis was an experiment in which rats were 
placed in a running wheel and required to make one 
of three different responses in order to avoid a sig- 
naled shock (Bolles, 1969). The responses were rearing 
on the hind legs, a 180-deg about-face, or a 90-deg
revolution of the wheel. The results clearly indicate 
that the running response was more rapidly acquired 
than the turning response, and that rearing was not 
increased at all by the avoidance contingency. This 
conclusion held true whether running was also re- 
quired as the escape response or whether some alterna- 
tive behavior was required as the escape response. 
Bolles concludes from these results that the speed of 
learning an avoidance response can be related to the 
animal’s species-specific defense reactions, and that 
running, as a part of the animal’s natural reactions to 
aversive stimuli, will be rapidly acquired relative to 
responses like turning and rearing which are not pri- 
marily defensive reactions. 

More recent experiments reported by Grossen and 
Kelley (1972) lead to a similar conclusion. These au- 
thors observed that rats placed in a large chamber 
tend to spend more time in contact with the walls of 
the chamber (thigmotaxis) and more time in a “freez- 
ing” posture when exposed to intermittent electric 
shock from a grid floor than they do under appeti- 
tively motivated conditions. They suggested that this 
thigmotactic behavior is one of the rat’s species-specific 
defense reactions and should, according to Bolles, be 
acquired very rapidly as an avoidance response. In a 
subsequent procedure using the same large chamber, 
two platforms were placed in the chamber; one 
around the perimeter of the chamber and one in the 
center of the chamber. Three groups of rats were 
assigned to one of three avoidance learning condi-
tions: Group C had access only to the center platform
as a place to avoid or escape the shock; Group P had 
access to the perimeter platform; and Group PC had 
access to both. In 30 subsequent acquisition trials in 
which the animals were placed in the chamber and 
had to jump on one of the platforms to escape or 
avoid shock, Group P made 77% avoidance responses,
Group C 57%, and Group PC 72%, with 92% of the
responses going to the perimeter platform. In a simi- 
lar control experiment, three groups of rats were 
required to climb either the center or perimeter plat- 
forms to obtain food pellets or given a choice of plat- 
forms. No differences in acquisition performance were 
observed, and no systematic preference between plat- 
forms was observed. 

Although the results are taken to support
Bolles’s SSDR hypothesis, there is an equally obvious
alternative which Grossen and Kelley recognize. If one
simply assumes that shock increases the operant rate 
(probability of contact) with the perimeter platform, 
the different rates of acquisition can be considered to 
reflect differences in the initial operant rate of the 
two responses. It would be of some interest to repli-
cate the experiment reported by Grossen and Kelley
and use a forced-choice procedure in the early acqui-
sition trials to make sure operant rates to each plat-
form were equated, then give the animals a choice of
the center vs. the perimeter to see which is pre- 
ferred. 

There are at least two general criticisms of Bolles’s 
(1970) position which should be noted. Consider a 
hypothetical case of an animal with a repertoire of 
two responses: A and B. If an aversive stimulus is 
now introduced, typically a third response (or group 
of responses) will begin to occur which Bolles would 
call the animal’s species-specific defense reaction(s). 
According to Bolles, the SSDR will be acquired more 
rapidly as an avoidance response than either response 
A or response B. I would suggest that this might often 
be observed to be the case, but only because the SSDR 
has a unique advantage over responses A and B. The
SSDR is sequentially dependent upon the aversive 
stimulus and as such predicts the absence of that aver- 
sive stimulus for a longer period of time than any 
other response in the animal’s repertoire. This im- 
plicit avoidance contingency provides an alternative 
explanation for the observation that shock-elicited 
behaviors are often observed to develop rapidly as 
avoidance responses—an alternative, that is, to the 
notion that these responses enjoy a special status as 
members of the category known as the animal's natural 
defense reactions. A more extensive analysis of this 
implicit contingency and its implications for avoidance 
learning has already been considered elsewhere and 
need not be repeated here (cf. Dunham, 1971).

A second and perhaps more important criticism of 
Bolles’s position can also be directed at the other vari-
ations on the “functional relevance hypothesis” which
I have considered on the preceding pages. As stated at
present, most versions of the functional relevance
hypothesis are very much ad hoc propositions. There
is no clearly defined criterion for determining, inde-
pendent of the rate of learning, whether a particular
response is or is not “functionally relevant.” It is not 
unusual to find various proponents of the notion 
elaborating at length about how a particular operant 
might be considered a short-term mutation of behav- 
ior in the wild. For example, in discussing the differ- 
ential rates at which rats acquire an avoidance re- 
sponse in the shuttlebox and in an activity wheel, 
Bolles (1970) states: 


It cannot be argued that running in the wheel 
constitutes an effective SSDR while running in 
the shuttlebox is marginally effective merely 
because the former is more rapidly acquired
than the latter. From the viewpoint of the SSDR 
hypothesis, both situations are ambiguous in 
that they permit only limited or compensated 
flight. The running wheel has been recognized as 
a peculiar piece of apparatus by many investi- 
gators who have used it in general activity 
studies, however, and perhaps it does permit 
the rat to get away in some meaningful sense. (p. 38)


Until less arbitrary, independent criteria for deter-
mining membership in various response categories are
proposed, it will not be possible to conduct critical
tests of the suggestion that certain response-reinforcer 
combinations are constrained because the response is 
not “functionally relevant” in the context of the 
reinforcer. 

As stated earlier, the same problem is characteristic 
of the other versions of the functional relevance 
hypothesis which have been considered. For example, 
Shettleworth suggests that face washing is not a func- 
tionally relevant response in the context of food 
gathering, but that rearing, digging, scrabbling, and 
lever pressing are. To argue, alternatively, that rear- 
ing could be considered a part of the animal’s de- 
fense reactions and that bar pressing is not particu- 
larly concerned with food gathering would simply beg 
the question. What is needed if the “functional rele- 
vance” hypothesis is going to be testable is an inde- 
pendent criterion for defining “functional relevance.” 

There are, in my opinion, two recent developments 
in this area of research which escape, to some extent, 
this problem of ad hoc reference to the biological 
significance of the response and provide us with some 
independent criteria for predicting whether or not a 
particular response-reinforcer combination will be 
effective. Black and Young (1972) working with rats 
and avoidance responding have suggested what might 
be called a systems constraint hypothesis. Moore (1971, 
1973) working with pigeons and an autoshaping pro- 
cedure, and Sevenster (1973) working with stickle- 
backs, have both proposed what might be called a 
Pavlovian hypothesis. In the remaining discussion I
shall briefly consider each of these two hypotheses and 
the associated data. 


The “Systems Constraint” Hypothesis 


Black and Young (1972) recently reported an inter- 
esting experiment in which one group of thirsty rats 
was trained to bar-press for food during presentation 
of one discriminative stimulus (S^D) and to lick a
drinking tube to avoid shocks during a second S^D.
The animals learned the appropriate responses in or-
der to obtain food and avoid the shocks. Following
the period of initial training, the rats were satiated 
with water prior to some training sessions. On these 
days, the licking avoidance behavior dropped out and 
the animal took a substantial number of shocks, while 
lever pressing for food was maintained at its previous 
level. Black and Young interpret these results in terms 
of system constraints. Specifically, they suggest that 
the normal causal factors of a response (e.g., the wa- 
ter regulatory system) may place constraints on the 
extent to which that response can be brought under 
the control of a different set of factors (e.g., the 
avoidance contingency). If the normal causal factors 
for drinking are present (e.g., deprivation), the animal 
will learn to drink to avoid shocks; if the normal 
causal factors are not there, drinking will not be con- 
trolled by the avoidance contingency. Bar pressing is 
assumed not to be under any such system constraints. 
To quote Black and Young: 


It would seem, then, that one dimension along 
which responses might be classified is the degree 
to which they are constrained from being
changed by operant reinforcement by the prop-
erties of the regulatory systems of which they
are a part. The criterion for classifying responses 
is not so much the conditionability of the re- 
sponse or its ease of conditioning under optimal 
circumstances, but rather the limitations on such 
conditioning. In this sense, bar pressing might 
be described as a better operant than drinking 
because it is less constrained by the regulatory 
control system of which it is a part than is drinking. (p. 44)


Stated as such, the proposition suggested by Black 
and Young is as ad hoc as some of those discussed 
earlier and differs very little from the functional rele- 
vance hypothesis. There is no criterion specified 
which would permit us to know, a priori, whether or 
not the regulatory system of a response will place 
constraints upon its use as an operant. Consider, for 
example, trying to predict whether a gerbil’s running 
behavior or paper-shredding behavior would be sub-
ject to serious “system constraints” of the sort ob-
served when drinking was used as an avoidance
behavior. It is not possible, from the preceding com- 
ments, to determine if these responses are subject to 
such constraints independent of the situation in which 
the constraints are observed to operate. 

Black and Young do, however, recognize the prob- 
lem and offer a partial escape from it by making use 
of Vanderwolf’s (1969) observation that some phasic 
skeletal movements are accompanied by dorsal hippo-
campal theta waves while others are not. Responses
typically labeled “instinctive” or “consummatory”
(eating, drinking, etc.) are not accompanied by theta 
activity. Black and Young suggest that responses 
which are not accompanied by theta waves from the 
dorsal hippocampus will be constrained as operants 
by their normal regulatory system; responses which 
are accompanied by theta activity will not be con- 
strained. 

Hence we are now in a position to make some 
testable predictions and, hopefully, accept or reject 
the “system constraints” hypothesis. For example, if 
theta waves do not accompany paper-shredding be- 
havior in the gerbil, its use as an avoidance operant 
should be constrained as drinking was in Black and Young’s study.
Specifically, if paper shredding is established as a 
Sidman avoidance response, satiation of the paper- 
shredding behavior should reduce avoidance respond- 
ing. If it does not, we can reject the theta wave cri- 
terion and the system constraints hypothesis, unless 
someone wants to argue about the “normal” causal 
factors for paper shredding in gerbils. To my knowl- 
edge, such an experimental test has not been con- 
ducted. 


The Pavlovian Hypothesis 


Another formulation which generates a number of 
testable predictions is the Pavlovian hypothesis. Moore 
(1971, 1973), working primarily in the context of the 
“autoshaping phenomenon” reported by Brown and 
Jenkins (1968), suggests that most instrumental learn- 
ing situations also contain the necessary ingredients 
for the emergence of classically conditioned responses 
and that such implicit conditioning can either facili-
tate or inhibit the effectiveness of certain response-
reinforcer combinations. 

The manner in which this Pavlovian conditioning 
mechanism is suggested to operate is best explained 
by direct reference to some of the data which argue,
quite convincingly, for its existence. Brown and 
Jenkins (1968) reported that repeated pairings of an 
illuminated response key and the presentation of 
grain were sufficient to cause pigeons to peck the key 
when it was illuminated. As Moore (1973) suggests, 
this phenomenon, called autoshaping, is potentially 
important because it indicates that a standard Pav- 
lovian conditioning procedure “could generate the 
key-pecking response so often used in operant condi- 
tioning research” (p. 160). Moore presents a varied 
array of data which argue collectively that the instru-
mental key peck is primarily the product of a Pav-
lovian process. It will be possible to consider only the
most direct arguments in the present discussion. A 
series of well-controlled experiments done in col- 
laboration with Jenkins (cf. Jenkins & Moore, 1973) 
are perhaps most convincing. These authors demon- 
strated that birds concurrently autoshaped with grain 
and water reinforcement “pecked” the key stimulus 
associated with grain and “drank” the key stimulus 
associated with water reinforcement. These results, 
which were observed when birds were both hungry 
and thirsty, indicate that an organism develops a 
conditioned response to the manipulandum which is 
similar in topography to the response elicited by the 
reinforcer. 

David Rackham, a graduate student in Moore’s 
laboratory, conducted similar experiments (Rackham, 
1971) in which a light stimulus was paired with access 
to a sexually receptive female pigeon. Although the 
controls for nonassociative effects were less extensive 
than in the food and water experiments, the results 
indicated that the topography of the response to the 
light was similar to the topography of the response to 
the female. Specifically, the male bird tended to 
“court” the visual stimulus. 

In the context of the autoshaping procedure, per- 
haps the most direct rationale for determining if the 
instrumental response emerges from a Pavlovian proc-
ess, as opposed to adventitious reinforcement of
arbitrary, emitted operants, is to use an omission 
training procedure in which any pecks made during 
the presentation of the conditioned stimulus actually 
cancel the presentation of grain on that trial. If the 
pecking actually develops from the Pavlovian con- 
tingency, the animal should persist in pecking the 
conditioned stimulus at rates appropriate to a partial- 
reinforcement schedule. If the pecking develops from 
the adventitious reinforcement of emitted skeletal 
movements, the omission training contingency pre- 
vents such reinforcement, and pecking should not 
develop. Results of an omission training experiment 
conducted by Audrey Kirby (1968), another of Moore’s 
students, and by Williams and Williams (1969) indi- 
cate that the omission procedure does not stop the 
development of pecking in the autoshaping procedure, 
hence supporting the Pavlovian mechanism. 
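
The logic of the omission procedure can be made explicit with a minimal sketch in Python. It is offered only as an illustration of the contingency described above, not as a reconstruction of the procedures actually used by Kirby or by Williams and Williams; the function names and the ten-trial session are hypothetical. The sketch shows that a peck during the conditioned stimulus is never followed by grain, while the key light itself remains paired with grain on every trial without a peck.

```python
from typing import Sequence


def grain_delivered(pecked_during_cs: bool) -> bool:
    """Omission contingency: a peck during the CS cancels grain on that trial."""
    return not pecked_during_cs


def proportion_reinforced(trials: Sequence[bool]) -> float:
    """Fraction of key-light presentations that end in grain.

    `trials` holds one flag per trial: True if the bird pecked during the CS.
    """
    if not trials:
        raise ValueError("need at least one trial")
    delivered = [grain_delivered(pecked) for pecked in trials]
    return sum(delivered) / len(delivered)


def pecks_followed_by_grain(trials: Sequence[bool]) -> int:
    """Count pecking trials that were followed by grain (zero by construction)."""
    return sum(1 for pecked in trials if pecked and grain_delivered(pecked))


# Hypothetical session: the bird pecks on 3 of 10 trials.
session = [True, False, False, True, False, False, False, True, False, False]
print(proportion_reinforced(session))    # 0.7
print(pecks_followed_by_grain(session))  # 0
```

Under these assumptions the key light is followed by grain on a fraction of trials equal to one minus the proportion of trials with a peck, which is why sustained pecking would be expected to resemble performance on a partial-reinforcement schedule, and why adventitious reinforcement of the peck itself is ruled out.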

Of several implications of Moore’s work, the one 
which is most directly relevant in the context of the 
present discussion is that such Pavlovian conditioning 
would be expected to make certain response-reinforcer 
combinations very effective, while others would not 
work very well at all. Specifically, if the instrumental 
response which we require of the organism is com-
patible with the species-typical reaction (uncondi-
tioned response) elicited by the reinforcer, the rein-
forcer should very rapidly gain control over the
instrumental behavior. If, alternatively, the instru- 
mental response we require is not compatible with the 
species-typical reaction elicited by the reinforcer, the 
reinforcer should not be very effective in gaining con- 
trol over the instrumental behavior. Hence if we can 
specify what the unconditioned response to a par- 
ticular reinforcer will be and determine if that re- 
sponse is incompatible with the instrumental response, 
it is possible to predict whether a particular response- 
reinforcer combination will be effective or not. Moore 
(1971, 1973) describes a number of examples of effec- 
tive and ineffective response-reinforcer combinations 
from both the ethology literature and the operant 
conditioning literature which support his contention. 
Pigeons, for example, can easily be trained to peck a 
key for food, since the unconditioned response to 
grain (pecking) is perfectly compatible with the in- 
strumental response (pecking). Alternatively, a num- 
ber of examples indicate that it is very difficult to 
train a bird to peck a key in order to avoid shock (cf. 
Hoffman & Fleshler, 1969). When key pecking is de- 
veloped with any impressive reliability as an avoidance 
response, either elaborate shaping (see Ferrari, Todo-
roff, & Graeff, 1973) or an extensive history of food
reinforced pecking (see Foree & LoLordo, 1974) ap- 
pears to be necessary. In defense of the Pavlovian 
hypothesis, Moore (1973, p. 172) has argued per-
suasively that many instances of key-pecking avoidance
might be interpreted as elicited aggressive reactions to 
the aversive stimuli. 

Bolles’s SSDR hypothesis, considered earlier, is
similar in spirit to Moore’s Pavlovian mechanism 
when the latter is considered in the context of 
avoidance learning. Bolles’s SSDRs are, in effect, un-
conditioned responses to aversive stimuli. The essen-
tial difference between the two positions with refer-
ence to avoidance learning is that Moore’s Pavlovian
mechanism is more precisely specified and testable, 
and vulnerable for that reason. To the extent that it 
becomes difficult to specify, a priori, in a given situa- 
tion what the unconditioned response and the rele- 
vant conditioned stimuli are, Moore’s Pavlovian mech- 
anism could also become an ad hoc proposition which 
is difficult to disprove. 

In addition to the various evidence reviewed by 
Moore in support of the Pavlovian analysis, several 
more recent experiments have appeared which are 
directly relevant to the suggestion that this mechanism 
can facilitate or inhibit the effectiveness of certain 
response-reinforcer combinations. 
Peterson, Ackil, Frommer, and Hearst (1972) paired
a 15-sec presentation of a retractable lever with either 
food reinforcement or intracranial stimulation (ICS) 
in two different groups of rats. A second retractable 
lever in the same chamber was presented randomly 
without any relationship to the other events in the 
chamber. A measure of the number of contacts with 
the lever indicated that the rats, like the pigeons in
Brown and Jenkins’s (1968) experiment, learned to 
contact the lever if it signaled food or ICS and paid 
little attention to the random lever. Of more interest, 
videotaped records of the rats’ behavior revealed 
qualitative differences in the response made to the 
lever by subjects receiving food and the response made 
to the lever by subjects receiving ICS. According to 
Peterson et al., in the group receiving the food reward 
“visual observations and the videotaped records re- 
vealed that contacts of CS+ were almost exclusively 
oral and consisted mainly of licking responses and 
gnawing behavior” (p. 1010). Alternatively, in the
group receiving the ICS reward the rats typically 
showed sniffing and exploratory behavior of the same 
type as elicited by the ICS alone. These results di- 
rectly support the main implication of Moore’s Pav- 
lovian analysis—that the unconditioned response to 
the reinforcer determines the topography of the re- 
sponse to stimuli paired with that reinforcer. 

A second, more recent experiment reported by 
Wasserman (1973) appears, however, to pose some 
problems for Moore’s Pavlovian mechanism. Wasser- 
man used 3-day-old chicks as subjects and an auto- 
shaping procedure which consisted of 50 daily 
pairings of an 8-sec green key light with the 4-sec activa- 
tion of a heat lamp located in the ceiling of a small 
chamber. The chicks, which were placed in the cold 
chamber, pecked the green response key on 80% of 
the trials, as compared to a group presented with the 
stimuli using Rescorla’s (1967) random control pro- 
cedure. When the experimental group was subse- 
quently switched to the random control condition, the 
pecking slowly extinguished. When the random group 
was switched to paired stimulus conditions, they 
started to peck and reached an asymptote of pecking 
on 40% of the trials.

In order to determine if the emergence of key 
pecking in this situation was the result of Pavlovian 
as compared to instrumental contingencies, chicks 
were trained using an omission procedure in which 
pecks on the key would cancel the forthcoming heat 
lamp. The results, similar to those reported by Kirby 
(1968) and described earlier, indicated that all chicks 
did peck the key at reduced rates under the omission 
training contingency. The omission training results
provide strong support for Moore’s suggestion that 
autoshaping phenomena are under Pavlovian rather 
than instrumental control. A problem arises, however, 
with the implication of Moore’s analysis, which re- 
quires that the unconditioned response to the rein- 
forcer will determine the topography of the response 
to the operant manipulandum. In Wasserman’s pro- 
cedure, the response to the heat lamp was suggested 
to be a species-typical wing-extension response, yet the 
initial autoshaped response was to peck the light. 
Wasserman notes that the pecking response was ob- 
served during early trials and eventually gave way to 
a response which he described as “snuggling up to the 
response key.” It would appear then that Wasserman
is dealing with a Pavlovian phenomenon, but one in 
which the conditioned and unconditioned responses 
are completely different in topography. 

Some recent work appears to reconcile Wasserman’s 
results with Moore’s Pavlovian analysis. Hogan (1974) 
observed that chicks placed in a cold environment 
often peck at their mother, causing the adult bird to 
move closer, providing a source of heat. Hence peck- 
ing at warm objects is one of a number of uncondi- 
tioned responses to the cold environment employed 
by Wasserman which might be expected to emerge 
from the autoshaping procedure. 

Although there is a substantial amount of support 
for Moore's suggestion, a potential problem with his 
analysis is hinted at by the difficulty encountered try- 
ing to interpret Wasserman’s data. When a bug enters 
the visual field of a hungry frog, the unconditioned
response (UR) seems to be the result of some precise 
and hard-wired circuitry between visual input and 
tongue. Hence there is no basis for confusion when 
attempting to specify the UR a priori, or in recogniz-
ing what responses are compatible or incompatible
with that tongue flip. Higher up the phylogenetic 
scale, however, that “hard wiring” appears to disap- 
pear in a good many cases with the net result that 
the UR to a given stimulus is much less predictable.
Hence a strict application of Moore’s analysis in the 
case of higher organisms would require, as he has in- 
dicated, a more extensive analysis of what factors 
determine which of several possible URs will emerge 
in a given situation. 

The final position to be considered was proposed 
by Sevenster (1973) and is very similar to Moore’s 
Pavlovian mechanism. In initial experimentation Sev- 
enster observed that a male stickleback (Gasterosteus 
aculeatus) would readily learn to swim through a ring 
or bite the top of a rod in order to be able to attack 
another male stickleback displayed at the end of an
aquarium. Both of these instrumental responses were 
performed at very high rates with short interresponse 
intervals. When a sexually ripe female was used as the 
reinforcer, however, a male stickleback again learned 
how to swim through a ring at high stable rates, but 
revealed very slow and variable rod-biting perform- 
ance. It appeared from these data that some type of 
constraint was operating which prevented rod-biting 
behavior from developing when courting a female was 
the reinforcer. 

In subsequent experiments, Sevenster analyzed in- 
terresponse times when the courtship reinforcer was 
presented on a variable-ratio schedule for rod biting. 
The results indicated that the interval between rod- 
biting responses got shorter as the interval of time 
since the last reward got longer. The longest inter- 
response interval was observed to be immediately 
after the opportunity to court. Taken together, the 
results indicate that some property of courtship be- 
havior actually inhibits (temporarily) the ability of 
the animal to perform the biting response. 

In order to determine if the actual pairing of the 
rod biting with the courtship response was necessary 
for such inhibition to be observed, Sevenster con- 
ducted yet another experiment in which a male fish 
was trained to bite the rod in order to attack another
male. As in earlier experiments, the response was rap- 
idly acquired and performed at a high rate. Once the 
fish was biting the rod at high rates, a sexually ripe
female was introduced into the tank for brief periods 
of courtship, with care taken not to associate the rod 
with the courtship response. A comparison of the 
latency between the opportunity to court and the next
biting response and the opportunity to fight and the
next biting response indicated that the opportunity to
court had little effect upon the latency to the next
bite. The results of this experiment (which used
only one subject and should be extended) indicate
that the inhibitory effect which courtship has upon
biting behavior depends upon the prior temporal
pairings of the operant manipulandum with the sexu-
ally ripe female.

Of equal interest are Sevenster’s anecdotal observa- 
tions of what the fish were actually doing during the 
long interresponse intervals following presentation of 
the sexually ripe female. He observed: 


The fish would approach the rod and start cir-
cling around it with zig-zag like jumps and often
with open mouth, sometimes making snapping 
movements at the tip or softly touching it. (p. 277)


In other words, the fish was treating the rod tip like
a “dummy female,” and the behavior during the long 
interresponse intervals was the species-typical court- 
ship ritual. 

Sevenster’s interpretation of these results is identi- 
cal, in some respects, to Moore’s suggested Pavlovian 
process. With reference to the courtship experiments, 
Sevenster suggests that the fish are conditioned to 
react to the biting rod as a “dummy female” and re- 
spond to it with courtship behaviors which are in- 
compatible with the biting behavior required. Alter- 
natively, swimming through the ring is suggested to 
be compatible with the elicited approach responses 
associated with courtship and aggression; hence this 
response is performed at relatively high rates. It 
would, however, be interesting to know if the swim- 
ming response did reveal the topography appropriate 
to the reinforcement. Presumably, the way a fish 
swims through a ring would differ depending upon 
his intentions to fight or to copulate. 
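
The prediction that follows from this interpretation can be written as a simple lookup, sketched here in Python under stated assumptions. The entries for courtship and for ring swimming come from the account summarized above; the entry pairing aggression with rod biting is inferred from the high response rates reported and is an assumption, as are all names in the code. The point of the sketch is that the table has to be filled in before response rates are observed if the prediction is to count as a test.

```python
# Compatibility of the reaction elicited by the reinforcer with the
# instrumental response required. The dictionary layout and all names
# are hypothetical; they are not taken from Sevenster or Moore.
COMPATIBLE = {
    ("courtship", "swim_through_ring"): True,
    ("courtship", "bite_rod"): False,
    ("aggression", "swim_through_ring"): True,
    # Assumption: inferred from the high rod-biting rates reported
    # when attack on another male was the reinforcer.
    ("aggression", "bite_rod"): True,
}


def predicted_rate(elicited_reaction: str, instrumental_response: str) -> str:
    """Return 'high' if the two are compatible, otherwise 'low'."""
    key = (elicited_reaction, instrumental_response)
    if key not in COMPATIBLE:
        # No a priori judgment is available, so no prediction can be made.
        raise KeyError("compatibility not specified for %r" % (key,))
    return "high" if COMPATIBLE[key] else "low"


print(predicted_rate("courtship", "bite_rod"))           # low
print(predicted_rate("courtship", "swim_through_ring"))  # high
```

A pairing that is absent from the table raises an error rather than returning a guess, which mirrors the requirement that compatibility be specified independently of the rate of learning.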

Up to this point, the results of Sevenster’s experi- 
ments and his analysis are completely compatible with 
Moore’s analysis and results which were developed in 
the context of autoshaping. In addition to the Pav- 
lovian mechanism, Sevenster suggests that there may 
be a mutually inhibitory interaction between the 
motivational system which controls courtship and the
motivational system which controls aggressive behav- 
ior. Without entering the semantic jungle which at- 
tempts to distinguish between associative and non- 
associative effects, it is suggested that the data 
considered thus far do not seem to justify the intro- 
duction of hypothetical competing “motivational 
states.” It should be difficult enough to determine if 
two observable responses are “compatible” or “in-
compatible” using the relatively parsimonious Pav-
lovian mechanism outlined by Moore.

In summary, there is a growing amount of evidence 
to indicate that not all response-reinforcer combina- 
tions are equally effective in developing control over 
instrumental behavior. I have attempted to isolate 
and criticize several variations on the general theme 
that the organism brings certain “biological predispositions”
into the instrumental learning situation
which either facilitate or inhibit the development of 
instrumental responding. Of these variations, I would 
suggest that Black and Young’s system constraints 
hypothesis and Moore and Sevenster’s Pavlovian hy- 
pothesis have the advantage of being vulnerable to 
relatively precise testing and have generated the most 
convincing data.


CONCLUDING COMMENTS


We started the discussion with the relatively con- 
crete problem of predicting the outcome of any one of 
20 possible instrumental contingencies which might 
be arranged in the hypothetical world of our gerbil. 
We considered a number of different contemporary 
positions, all of which reject the “weak law of effect” 
as a point of departure for the analysis of reinforce- 
ment phenomena. The six positions considered were: 
(a) Premack’s differential probability position; (b) the 
response deprivation hypothesis; (c) the optimal- 
duration model; (d) the functional relevance hypothe- 
sis; (e) the system constraints hypothesis; and (f) the 
Pavlovian hypothesis. The last three positions emphasize
the importance of the biological predispositions
of particular species in attempting to predict the 
effectiveness of particular response-reinforcer combi- 
nations. The first three positions have, in general, ig- 
nored such biological constraints and concentrated 
upon developing a set of predictive rules which are 
implicitly assumed to hold for all species and response- 
reinforcer combinations. 

Each of the above approaches offers a viable alter- 
native to the weak law of effect as a point of de- 
parture in our analysis of the reinforcement process. 
They are not, however, without their problems. The
major problem with Premack’s differential probability 
position and with the response deprivation hypothesis 
is that much of the research surrounding these posi- 
tions has failed to include control conditions to deter- 
mine if the changes in instrumental behavior can be 
produced by changing the distribution and/or amount 
of contingent responding without a contingency in 
effect. It is possible that many of the increases in in- 
strumental responding that have been interpreted as 
evidence for a reinforcement process in the context of 
the Premackian methodology would have occurred in 
the absence of the reinforcement operation, i.e., an
instrumental response requirement. 

As a viable alternative to both the response deprivation
hypothesis and Premack’s differential probability
rules, the optimal-duration model also remains
to be investigated before either of the former positions 
can be unequivocally accepted or rejected. 

The three positions which have emphasized the 
role of biological constraints which operate upon 
particular response-reinforcer combinations also suffer 
some serious problems. A number of specific criticisms 
were offered with reference to specific experimentation
in this area in the preceding discussion. These
criticisms are, however, overshadowed by the general
lack of precise definitions and testability which is
characteristic of many theoretical notions in this area. 
Before one can test predictions made by the functional 
relevance hypothesis, for example, it is necessary to 
have a precise definition of the term functional relevance
which can be translated into experimental operations.
There is no doubt that the three positions
which have emphasized the importance of biological 
constraints directly question the generality of alterna- 
tive theoretical accounts of reinforcement such as 
Premack’s differential probability rules and the re- 
sponse deprivation hypothesis. They imply, for exam- 
ple, that the response deprivation rule will not apply 
to particular cases. In some instances substantial re- 
ductions in the amount of contingent responding will 
not increase instrumental responding, while in oth- 
ers an increase in instrumental responding might be 
observed without a reduction in contingent respond- 
ing. In order, however, to pit the response deprivation 
hypothesis against such positions as the functional 
relevance hypothesis, it will be necessary to have some 
method of determining if particular response- 
reinforcer combinations are “functionally relevant” 
prior to conducting the experiment. 

It seems, at present, reasonable to assume that the- 
oretical schemes such as Premack’s probability rules, 
the response deprivation hypothesis, and the optimal- 
duration model will have to be modified to account 
for the biological predispositions which are being dis- 
cussed by theorists such as Shettleworth, Moore, and 
Sevenster. It will be helpful, when these two con- 
temporary approaches to reinforcement are merged in 
a theoretical scheme, if both approaches are formulated
precisely enough to produce testable hypotheses.


Schedule-Induced Behavior


INTRODUCTION 


In most operant conditioning experiments, a single 
aspect of behavior such as a key peck or a lever press 
is selected as the instrumental response. Although a 
reinforcer will not ordinarily follow every instance of 
the response, it follows some, usually immediately, and 
does not occur at other times. There are departures
from this rule (such as delay of reward procedures, 
mixed classical and operant procedures, and some 
conjunctive schedules), but it is so common, and seems
so close to the “natural” contingencies of the animal's 
wild environment that it has become the norm. Yet it 
embodies a number of arbitrary features: the single 
instrumental response, the fixed short delay between 
response and reinforcer, and the absence of reinforcement
in the absence of the response. The term
“response-contingent reinforcement” usually embraces
all three of these features. 

Because of the apparent naturalness of the response-contingent
procedure, it has held the limelight over
the years. A mass of experimental literature has developed
describing results that are in many cases difficult
to interpret. The ubiquitous feedback loop between
behavior and its consequences makes it very hard to
discern the behavioral mechanisms that allow the
combined animal-schedule system to settle down into
the classical “schedule performances.” In any response-contingent
procedure, the pattern of responding necessarily
influences not only the correlation between
responding and reinforcement, but also the correlation 
between reinforcement, and temporal and stimulus 
variables. For example, if an animal on a fixed-interval 
schedule responds only sporadically, the reinforce- 
ments may occur at variable rather than fixed inter- 
vals. Yet it is these temporal correlations (termed by 
Zeiler “indirect” variables, see Chapter 8) that are 
most important in determining the final pattern of 
performance. If the way they act is to be understood, 
the response contingency is an unnecessary complication.
Consequently this chapter is devoted primarily
to behavior induced by response-independent (classical
conditioning) procedures, and to those aspects
of response-contingent schedules that are not directly 
dependent on the response contingency. 

The term schedule-induced does not yet have a 
widely accepted definition. However, it is clear that 
when an animal is exposed to a schedule of periodic 
food or electric shock, some activities are facilitated 
and others are reduced by this operation. Only those 
activities that are facilitated by the schedule (by com- 
parison with a pre- and post-schedule baseline when 
no food or shock is delivered) will be termed induced 
behaviors. The term facultative behavior has been 
coined to refer to activities that occur on schedules 
but do not appear to be directly affected by schedule 
factors (p. 135). 

Schwartz and Gamzu, in Chapter 3 of this volume, 
discuss the problem of terminal responses, i.e., induced
behavior that emerges in the presence of, or is
directed toward, stimuli that are highly predictive of 
food or some other positive reinforcer; and Hutchin- 
son in Chapter 14 deals with behavior induced by 
schedules of aversive events. This chapter is mainly 
concerned with interim activities on schedules of posi- 
tive reinforcement; that is, induced behaviors which 
occur at times when a reinforcer is unlikely to be 
delivered. Terminal responses are discussed only as 
much as is necessary to present a comprehensive pic- 
ture of the ways in which induced and facultative 
behaviors interact under the influence of schedule 
factors. 

The first part of the chapter will be concerned with 
the factors that determine what induced activities will
occur, where (in relation to time and stimuli) they
occur, and how much (with what ‘strength’) they oc- 
cur. The second part deals with the temporal and 
sequential constraints that underlie induced behavior 
sequences. 


BEHAVIOR INDUCED BY 
PERIODIC FOOD 


The simplest case to consider is the periodic presen- 
tation of “free” food to a hungry rat or pigeon (FT: 
fixed-time reinforcement). At first, the animal is likely 
to spend much of its time between food deliveries 
exploring around the food site. However, within a few 
sessions this behavior drops out and is replaced by a 
regular sequence of activities within each interval. 
Each activity becomes increasingly well defined and 
the sequence as a whole becomes increasingly stereo- 
typed as training proceeds. Figure 1 shows the patterns 
of behavior developed by a pigeon and a rat under 
broadly similar conditions of periodic food delivery. 
Both animals were hungry. The pigeon was in a bare 
enclosure and received 3-sec access to food every 12 
sec. The rat was in a hexagonal enclosure that allowed 
drinking and running in a wheel, as well as other 
activities, and received a food pellet every 30 sec. 
Despite these differences the general features of the 
behavior are similar. Activities occur in sequence, 
with some (interim activities) typically occurring early 
in the interval (facing the window wall and wing flap- 
ping for the pigeon; drinking and running for the 
rat). A single terminal response (pecking for the
pigeon; “food anticipation” for the rat) increases in
frequency up until the end of the interval. First, I 
discuss the types of behavior that have been observed 
under conditions of periodic food delivery. In later 
sections I consider the sequential properties of in- 
duced behavior and the possible mechanisms of in- 
teraction that these imply. 


Terminal Responses 


In the above examples, time is the variable that 
signals the imminence of food. This is not necessary. 
Indeed the most popular procedure for studying in- 
duced terminal responses (autoshaping: Brown & 
Jenkins, 1968) follows the standard classical-condition- 
ing paradigm: food is intermittently presented, usu- 
ally at variable intervals, each delivery being signalled 
by a brief (5-10 sec) stimulus. The conclusions of this 
work can be summarized quite briefly (see Ch. 3 for 
additional references and a fuller account). (1) Some 
behavior will be induced in the presence of the stimu- 
lus that signals food. (2) It seems to be necessary that 
the stimulus predict food, i.e., precede food more reliably
than any other stimulus precedes food. Simple
pairing is not enough by itself (Bilbrey & Winokur, 
1973; Gamzu & Williams, 1971, 1973; Rescorla, 1967). 
(3) The type of induced behavior depends on a num- 
ber of factors. (a) The type and strength of the sig- 
nalled reinforcer (food, water, shock, sex, etc., cf. 
Jenkins & Moore, 1973). Often the behavior resembles 
the consummatory response usually made to the reinforcer.
(b) The nature of the signal stimulus, e.g.,
whether or not it can be manipulated, its location in 
relation to the reinforcer site, whether it can be 
sensed without the animal orienting towards it, its 
intensity, and its biological “relevance” for the signalled
reinforcer. (c) Past history, e.g., food-related
responses acquired in an instrumental situation may
reappear in an autoshaping situation. (d) How good a
predictor the signal stimulus is. For example, induced
pecking is more likely to develop and is stronger if the
signal stimulus is short than if it is long (Innis &
Keehn, personal communication; Ricci, 1973) or if the
inter-stimulus (inter-trial) interval is long rather than
short (Groves & Brownstein, 1973; Terrace, Gibbon,
Farrell, & Baldock, 1975; see also Wilton & Clements,
1971). The induced behavior is more likely to resemble
the consummatory response, and to be more
vigorous, the better the relative proximity of the
signal stimulus to the signalled reinforcer (Jenkins,
1970; Hearst & Jenkins, 1974; Staddon & Simmelhag, 
1971). 

As the chapter by Schwartz and Gamzu in this
volume attests, there is still considerable controversy
about the correct explanation for the terminal re- 
sponses which develop on response-independent sched- 
ules. A confusing factor is the misleading opposition 
between “Pavlovian” and “operant” views, as if these 
two classes were mutually exclusive. Historically this 
opposition is in part traceable to Skinner’s (1948) par- 
ticular operant account. His “adventitious reinforce- 
ment” hypothesis is that a terminal response occurs 
first for unspecified reasons, is accidentally contigu- 
ous with food delivery, is strengthened thereby, and is 
thus more likely to occur again. This cycle repeats 
and the result is persistent, stereotyped “supersti- 
tious” behavior. This view owes nothing to Pavlovian 
processes and rests on a contiguity view of the action 
of reinforcers for which the evidence is weak. More- 
over, its lack of any quantitative content robs it of 
predictive power: How often must a given response 
be contiguous with reinforcement to become fully 
conditioned? How contiguous must it be? Are all 
responses the same in these respects? Without good 
answers to these questions, the hypothesis is essentially 
untestable since a response which is increasing in fre- 
quency, for whatever reason, is more likely to occur 
in close proximity to reinforcement and thus to fulfill 
Skinner’s condition for its further increase. 

There are several versions of the Pavlovian view of 
terminal responses. Perhaps the simplest is that any
situation which predicts food is likely to induce food- 
related behavior (cf. Moore, 1973). Hearst and Jenkins 
have developed a refinement of this view to deal with 
the directed nature of much induced activity (sign 
tracking: Hearst & Jenkins, 1974). Yet another view 
points to the effect of contingency strength (predictiveness)
on the range of behavioral variation (Staddon,
1976): Stimuli that are good predictors of food,
for example, reliably produce stereotyped behavior.
This restriction of variability is taken as the funda- 
mental property of strong contingencies. 


ADVENTITIOUS REINFORCEMENT


Induced behavior is a phenomenon worthy of inde- 
pendent study. It cannot convincingly be dismissed as 
a curious illustration of learning principles better 
studied directly in other ways. The latter position is
most strongly represented by Skinner’s adventitious 
reinforcement hypothesis. The arguments against this 
view have been elaborated elsewhere (e.g., Gamzu & 
Schwartz, 1973; Rachlin & Baum, 1972; Staddon, 
1972b, 1976; Staddon & Simmelhag, 1971) and need 
only be summarized here: (a) During the development
of induced terminal responses a response such
as the animal’s putting its head into the food aperture
may predominate for a while, only to be supplanted 
by another response, such as pecking, despite consis- 
tent initial pairing of the first response with food. 
Adventitious reinforcement cannot account either for 
the decline in the first activity or the appearance of 
the second. (b) Terminal responses such as pecking in 
a food situation are quite resistant to response contin- 
gencies that make food delivery less likely if the re- 
sponse occurs (negative automaintenance or omission 
training: Williams & Williams, 1969). A response that 
occurs in spite of a negative contingency is unlikely 
to require a positive one for its maintenance. (c) Nega- 
tive contingencies do have some suppressive effect on 
a response such as pecking. However, much of the 
effect is attributable to effects of the contingency on 
the frequency and pattern of food delivery, i.e., on 
temporal and stimulus (not response) contingencies. 
(d) There is a logical problem in attributing the 
maintenance of a response to accidental conjunctions 
between it and food delivery. This problem is not 
overcome by demonstrating that the imposition of a 
negative contingency reduces the level of the behavior, 
even if the reduction is below the level that would be 
maintained by yoked response-independent food deliv- 
ery. Showing that a response is sensitive to a real 
negative contingency does not force the conclusion 
that its prior occurrence was owing to an accidental 
positive one. (e) Occasional response-independent food
deliveries, superimposed on a baseline of responding
maintained by response-dependent food delivery, often
result in suppression of the instrumental response,
even though the absolute number of response-food
conjunctions is increased by this operation (Rachlin
& Baum, 1972). The addition of “free” reinforcements
makes the instrumental response less predictive of the 
reinforcer. Hence these results support the conclusion 
from experiments on response-independent procedures 
that a schedule is effective in modifying behavior 
only to the extent that it arranges a predictive relation
(i.e., a real contingency) between an event (a
stimulus or a response) and the occurrence of a rein- 
forcer (Rescorla, 1967; Rescorla & Wagner, 1972). 
“Prediction” of reinforcer A by response B means,
in this context, that $p(A \mid B)$ is greater than $p(A \mid \bar{B})$
(where $\bar{B}$ denotes the absence of B) or,
in temporal terms, that B precedes A more closely than
A is preceded by any other response (see Gibbon,
Berryman, & Thompson, 1974, for a careful discussion
of quantitative measurement of contingency strength).
The adventitious reinforcement hypothesis, of course,
attends only to positive pairings, i.e., to $p(A \mid B)$.
Moreover, if, in the ideal case, $p(\bar{B})$ approaches zero
(the animal does nothing but B), B cannot really be
said to predict A, since one of the terms of the relation
that defines predictiveness, $p(A \mid \bar{B})$, has vanished.
Thus this hypothesis implies a more primitive view of 
the concept of contingency than can be justified by 
recent experimental work. 
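
The difference between counting pairings and measuring a contingency can be illustrated with a small calculation. The sketch below is illustrative only: the function name and all counts are hypothetical and are not data from any study cited here. It assumes a session divided into time bins, with A meaning "reinforcer delivered in the bin" and B meaning "response occurred in the bin", and it assumes that response frequency stays fixed when free food is added.

```python
def contingency(a_with_b, b_total, a_without_b, not_b_total):
    """Return (p(A|B), p(A|not B), difference) from counts of time bins.

    A = reinforcer delivered in a bin, B = response occurred in that bin.
    """
    p_a_given_b = a_with_b / b_total
    p_a_given_not_b = a_without_b / not_b_total
    return p_a_given_b, p_a_given_not_b, p_a_given_b - p_a_given_not_b


# Hypothetical session of 1000 bins; the response occurs in 400 of them.
# Response-dependent food only: every reinforcer falls in a response bin.
baseline = contingency(a_with_b=40, b_total=400, a_without_b=0, not_b_total=600)

# Same session with "free" food added to 10% of the bins that held no food:
# 36 extra conjunctions with B (10% of 360) and 60 deliveries without B.
with_free = contingency(a_with_b=40 + 36, b_total=400, a_without_b=60, not_b_total=600)

print("baseline :", baseline)   # 40 response-food conjunctions
print("with free:", with_free)  # 76 response-food conjunctions
```

Under these assumed counts the number of response-food conjunctions nearly doubles, yet the difference between the two conditional probabilities falls, which is the sense in which free reinforcement makes the response less predictive of food.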

The adventitious reinforcement hypothesis arose 
from a tacit assumption that the effects of response- 
dependent reinforcement are somehow more funda- 
mental than those of response-independent reinforce- 
ment. Skinner explained the response-independent 
case by means of an account derived from experiments 
on response-dependent reinforcement. Even if he is
right in believing that the two cases share common 
mechanisms, the proper translation may be in the 
opposite direction. Perhaps mechanisms derived from 
a study of response-independent procedures can be 
applied to explain the effects of response-dependent 
reinforcement. This is not a new idea. However, the 
failure of the adventitious reinforcement notion, and 
recent advances in our understanding of the concept 
of contingency, make it once again a viable one (cf. 
Schwartz & Gamzu, Chapter 3 in this volume; Stad- 
don, 1976). 


Interim Activities 


On periodic food schedules a variety of activities 
occur at times when food delivery cannot occur, for 
example, early in the interval on fixed-time schedules. 
Some are directed at objects in the environment (e.g., 
pecking in birds, chewing, drinking, or pawing by 
rats), others have no obvious referent (e.g., head bob- 
bing, beak movement, pacing). Directed activities 
come under the control of their own “incentive stim- 
uli” (Bindra, 1972). All induced activities appear to 
depend on motivational variables. Two kinds of motivational
variable are important: variables related to
the scheduled reinforcer (e.g., things that affect hunger,
if food is scheduled); and variables related to the
particular activity (e.g., to thirst, for induced drink- 
ing). 

Induced behaviors are not all affected in the same 
way by schedule variables. Hence there is much need 
for a “natural history” of induced behaviors. More 
purely descriptive work needs to be done to map out 
the sequences of behavior that occur with a variety of 
combinations of species, schedule, reinforcer, and sup- 
porting environment. In the absence of the informa- 
tion such studies could give, generalizations about 
underlying mechanisms must be tentative. Only one 
interim activity, schedule-induced drinking, has been 
studied in anything like the necessary depth (Falk,
1969, 1971). Running (Levitsky & Collier, 1968; Segal,
1969), and schedule-induced aggression (Azrin, Hutchinson,
& Hake, 1966; Flory, 1969; Richards & Rilling,
1972) have also received some attention. Most of this 
work has been done with rats. The following discus- 
sion, therefore, rests more heavily on these behaviors, 
and particularly on studies of induced drinking, than 
is perhaps desirable. 


INDUCED DRINKING 


Falk (1961) was the first to draw attention to the 
curious fact that hungry rats responding on a schedule 
of intermittent food reinforcement will ingest large 
quantities of water (polydipsia). This occurs even 
though the rats are not deprived of water, so that 
their total intake may be several times that necessary 
to maintain water balance. Since Falk’s original 
paper, he and others have demonstrated induced 
drinking in numerous species (squirrel monkeys, 
chimpanzees, pigeons and doves to some degree), and 
on a variety of intermittent food schedules (see Falk, 
1969 and 1971, for reviews). 

Falk’s original emphasis was on the excessive 
(polydipsic) aspect of schedule-induced drinking. 
Hence much of the initial research was an attempt to 
reconcile the behavior with known regulatory mechanisms.
As is by now well known, none of these attempts
was wholly successful. Neither central effects,
in the form of an altered water balance caused by 
intermittent food, nor peripheral regulatory mecha- 
nisms (the “dry mouth” theory) seem adequate to 
explain schedule-induced drinking. It would be rash 
to dismiss this line of work as unprofitable. More
recent efforts of this sort (for example, explorations
of the link between temperature and water regulation
systems, e.g., Carlisle, 1973) may yet uncover a regulatory
basis for the effect. However, the relative lack of
progress along physiological lines revives interest in 
the behavioral determinants of the phenomenon. 
These are the focus of the present account. 

The most striking thing about schedule-induced 
polydipsia is that the total amount of water drunk 
each day is so much greater than normal. This accounts
for Falk’s original emphasis on total intake, as
a function of various schedule variables. However, in 
most studies of operant behavior the total amount of 
the behavior is of much less interest than the rate at 
which it occurs, or the percentage of time that it takes 
up (see de Villiers, Chapter 9 and Dunham, Chapter 4 
in this volume). Moreover, the obvious adaptiveness 
of operant behavior in general has long stood in sharp 
contrast to its maladaptiveness in particular situations.
Some years ago, for example, Skinner (1961)
pointed out that on certain schedules the energy ex- 
pended by the animal in responding exceeds that 
received from the food reinforcement obtained. De- 
spite these failures of energy regulation, response and 
reinforcement rates usually follow some simple func- 
tional relation. Hence total intake may not be the 
most useful measure of induced drinking. How then 
should induced behavior be measured? 


MEASUREMENT OF INDUCED ACTIVITIES 


The most important factor that determines choice 
of a particular dependent measure is the experi- 
menter’s belief about its probable cause. So long as 
induced drinking was thought to derive from an 
altered state of water balance, total amount drunk 
(over the 24 hours) was an appropriate measure. It
now seems that this may not be the best way to look 
at it. Induced drinking may somehow be related to 
schedule variables and to the mechanisms through 
which they affect operant behavior. From this point 
of view appropriate measures might be ingestion rate
(ml/min), drinking rate (licks/min), or the fraction of 
time engaged in drinking. 

However, the schedule-induced nature of this effect
introduces three complications (Flory, 1971; Staddon
& Simmelhag, 1971). The first concerns the stimulus
control of induced drinking. On fixed- and variable-interval
schedules, drinking typically occurs just after
food delivery, and it can easily be shown that once
behavior has stabilized, drinking is directly under the
control of each eating bout: each bout of eating produces
a bout of drinking. Since we are interested in
the relation between frequency and amount of food
delivery and the animal’s tendency to drink, this controlling
relation introduces a confounding factor.
First, suppose that all the induced drinking is simply
post-prandial, with each “meal” producing a fixed
amount of drinking (Lotter, Woods, & Vasselli, 1973;
Stein, 1964). While the rate of food delivery determines
the rate of drinking in such a situation, it
would be a mistake to conclude that this variable has
a direct effect on the tendency to drink. (Similarly, the
rate at which conditioning trials occur tells us noth- 
ing about the animal’s tendency to make the condi- 
tioned response.) 

A second confounding factor is the opportunity for 
drinking. If, to continue with the post-prandial example,
the drinking bout produced by each food delivery
is of a fixed duration, scheduling food deliveries
too frequently could reduce the time available for
drinking and thus actually reduce the amount drunk
per food delivery. This would artificially limit the
function relating food and drink rates. 
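
The opportunity constraint can be made concrete with a toy calculation. The following sketch is not drawn from any cited experiment: the function name, the 3-sec eating time, and the 20-sec bout length are assumptions chosen only to show how a fixed-duration bout is truncated when food deliveries are closely spaced.

```python
def drinking_per_delivery(interfood_s, eating_s=3.0, bout_s=20.0):
    """Seconds of drinking per food delivery when a fixed-duration bout
    must fit into the time left over after eating (hypothetical values)."""
    available = max(interfood_s - eating_s, 0.0)
    return min(bout_s, available)


for interval in (60, 30, 15, 8):
    per_delivery = drinking_per_delivery(interval)
    fraction = per_delivery / interval  # share of each interval spent drinking
    print(f"interval {interval:>2} s: {per_delivery:4.1f} s drinking, "
          f"fraction of interval {fraction:.2f}")
```

With these assumed values the bout is complete at the longer intervals and is cut short once the interfood interval falls below the sum of eating time and bout length, so the amount drunk per delivery declines even though nothing about the animal's tendency to drink has changed.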

A third complication is the limit on the total 
amount of food that food-deprived animals can be 
permitted to ingest each day. This usually forces a 
reduction in the total number of daily food deliveries 
if the size of each delivery is increased. Consequently, 
the total amount of water drunk might appear to de- 
crease as food portion size is increased, even though 
the rate of drinking is directly related to portion size 
(Falk, 1967; Flory, 1971; Hawkins, Schrot, Githens, & 
Everett, 1972; Staddon & Simmelhag, 1971). These 
problems make a number of experimental results diffi- 
cult to interpret. 
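
A short worked example may help to show how the daily food limit can dissociate total intake from drinking rate. The numbers and the function name below are hypothetical; the sketch assumes a fixed daily allowance of pellets, a constant food rate of one delivery per minute, and a session that ends when the allowance is exhausted.

```python
def session_totals(pellets_per_delivery, rate_ml_per_min,
                   daily_pellet_cap=200, deliveries_per_min=1.0):
    """Return (session length in min, total ml drunk) for a session that
    ends when the daily pellet allowance has been delivered."""
    deliveries = daily_pellet_cap / pellets_per_delivery
    session_min = deliveries / deliveries_per_min
    return session_min, rate_ml_per_min * session_min


# Assumed drinking rates: the larger portion induces the higher rate.
one_pellet = session_totals(pellets_per_delivery=1, rate_ml_per_min=0.20)
two_pellets = session_totals(pellets_per_delivery=2, rate_ml_per_min=0.30)

print("one pellet :", one_pellet)
print("two pellets:", two_pellets)
```

In this illustration the two-pellet condition produces the higher drinking rate but the smaller total intake, simply because the session is half as long.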


EFFECT OF FOOD RATE


The two panels of Fig. 2 show data relating rate of 
drinking (ml/min, left panel; licks/min, right panel)
to frequency of food delivery (opportunities to eat/min,
here referred to as food rate), replotted from
studies by Flory (1971) and Falk (1969). These response
rate vs. food rate functions will be referred
to as response functions. Food was delivered on fixed- 
interval schedules of different values, and the data 
represent stable performance. Similar results have 
been found by Hawkins et al. (1972) with fixed and 
variable-time food schedules, and by Segal, Oden, and 
Deadwyler (1965) and Staddon and Ayres (unpub- 
lished) with fixed-time schedules. As others have 
shown, the presence or absence of a response contingency
makes little difference to the amount and temporal
placement of schedule-induced drinking (Burks,
1970; Segal et al., 1965; Wayner & Greenberg, 1973).

Fig. 2. Data replotted from Flory (1971) and Falk (1969)
relating measures of drinking (licks/min, ml drunk/min) to
food rate (pellet deliveries/min).

Three aspects of these data are important for 
future discussion: (a) The functions are all monotonically
increasing, with higher food rates associated
with higher rates of drinking, until food deliveries
are spaced very close together indeed. The functions
for amount drunk in the left hand panel turn down 
only when food deliveries occur once every four sec- 
onds or oftener. (b) The functions for lick rate and 
rate of ingestion are generally similar in form, imply- 
ing that an approximately fixed amount of water is 
ingested with each lick (see Figure 3). (c) Flory’s data 
show that at a given food rate, two pellets per food 
delivery induce more drinking than one pellet per 
delivery. Moreover, this difference in terms of lick 
rate is greater when food delivery is infrequent. Since 
the ordinate is logarithmic, this means that the pro- 
portionate (but not the absolute) increase in lick rate 
is greater at low food rates. 
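
The distinction between proportionate and absolute differences on a logarithmic ordinate can be checked with a few lines of arithmetic. The lick rates below are invented for illustration and are not read from Figure 2; the point is only that the vertical gap on a log axis tracks the ratio of two values, not their difference.

```python
import math

# Hypothetical lick rates (licks/min) at a low and a high food rate.
low_food_rate = {"one_pellet": 10.0, "two_pellets": 30.0}
high_food_rate = {"one_pellet": 100.0, "two_pellets": 150.0}


def compare(rates):
    """Return (ratio, absolute difference, gap on a log10 axis)."""
    ratio = rates["two_pellets"] / rates["one_pellet"]
    absolute = rates["two_pellets"] - rates["one_pellet"]
    log_gap = math.log10(rates["two_pellets"]) - math.log10(rates["one_pellet"])
    return ratio, absolute, log_gap


print("low food rate :", compare(low_food_rate))
print("high food rate:", compare(high_food_rate))
```

Here the curves would be farther apart on the log axis at the low food rate (a threefold difference) than at the high food rate (a 1.5-fold difference), even though the absolute difference in licks/min is larger at the high food rate.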


HYPOTHESES TO EXPLAIN 
SCHEDULE-INDUCED DRINKING 


Four simple behavioral hypotheses have been pro- 
posed, explicitly or implicitly, as explanations for 
schedule-induced drinking. These are: (a) the post- 
prandial hypothesis, (b) the opportunity hypothesis, 
(c) the adventitious reinforcement hypothesis, and (d) 
the motivation hypothesis. These are discussed next, 
followed by an account of running during food schedules
and a summary of the relations between running
and drinking. 


Post-Prandial Hypothesis. The simplest hypothe- 
sis is that the drinking is simply normal post-prandial 
drinking, so that the more “meals” or “bites” the rat 
takes, the more he drinks (Kissileff, 1969; Lotter et al., 
1973; Stein, 1964). This idea implies simple propor- 
tionality between the rate of drinking and the rate of 
food delivery up to a maximum when the animal 
drinks all the time except when he is eating. Formally 
it implies a relation of the form $D = K_1 R + K_2$, where
$D$ is the rate of licking (licks/min; or water ingestion,
ml/min), $R$ is the frequency of opportunities to eat
(food rate), $K_1$ is a constant representing the size of
each post-prandial drinking bout, and $K_2$ is the rate
of drinking in the absence of the food schedule.
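
The form of this prediction can be sketched numerically. The snippet below is an illustration only: the function name, the value chosen for the bout-size constant (50 licks per food delivery), and the ceiling of 300 licks/min are hypothetical, selected to show strict proportionality followed by a maximum.

```python
def post_prandial_rate(food_rate, k1, k2=0.0, max_rate=None):
    """Predicted drinking rate D = K1 * R + K2, optionally capped at a ceiling
    that stands for the animal drinking whenever it is not eating."""
    d = k1 * food_rate + k2
    return d if max_rate is None else min(d, max_rate)


# Food rates in deliveries/min; K1 and the ceiling are assumed values.
for r in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(r, post_prandial_rate(r, k1=50.0, max_rate=300.0))
```

On this account, doubling the food rate doubles the drinking rate everywhere below the ceiling, so any systematic curvature in that range counts against the hypothesis.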

The empirical functions in Figure 2 are not com- 
patible with this equation because they all show con- 
siderable negative acceleration (in linear coordinates 
as well as the log-log ones of Figure 2) over much of 
their range. The post-prandial hypothesis can never- 
theless be applied to these functions if they are ap- 
proximated by two line segments, one with a steep 
slope starting at the origin (K, — 0) followed by a 
shallower segment after a break point in the vicinity 
of a food rate of one pellet every two min. However, 
these two segments suggest two contradictory inter- 
pretations. ‘The steep segment from the origin is per- 
fectly consistent with the post-prandial view with 
drinking rate proportional to food rate. However, the 
shallow segment then represents some kind of suppres- 
sion of drinking at high food rates, perhaps due to 
restricted opportunity to drink. On the other hand, if 
the shallow segment is considered to represent the 
post-prandial view, the y-intercept, $K_2$, is much greater
than the rate of drinking in the absence of the food 
schedule (which will usually be close to zero since the 
animals have unlimited access to water in their home 
cages). Hence $K_2$ must be interpreted as some kind of
“inducing” effect of the food schedule. 

In addition to its incompatibility with the func- 
tions in Figure 2, there are three other difficulties 
with the post-prandial hypothesis. (1) Schedule- 
induced drinking usually takes a few sessions to 
develop (Hawkins et al., 1972; Reynierse & Spanier, 
1968; Staddon & Ayres, 1975). Yet post-prandial drinking
is presumably well developed in normal adult
rats. If drinking on schedules of food delivery is sim- 
ply post-prandial drinking, there seems to be no rea- 
son why it should not occur from the start. (2) Drink- 
ing on periodic food schedules is not always restricted 
to the period just after food delivery. Its location 
within the interfood interval depends on interval
length—the longer the interval, the later the onset of 
drinking (Segal et al., 1965)—and if, after the rat has 
learned to drink in the post-food period, access to the 
water bottle is restricted to a time late in the interval, 
drinking eventually recovers to essentially full strength 
(Flory & O’Boyle, 1972; Gilbert, 1974). (3) Drinking 
and eating in rats with free access to food and water 
are linked, but not in the way required by the post- 
prandial hypothesis. In a careful series of studies, 
Kissileff (1969) showed that drinking occurs both just 
after and just before eating bouts. 

Lotter et al. (1973) have recently revived the post- 
prandial hypothesis in an attempt to explain schedule- 
related drinking as an “artifact” of the small “meal” 
size imposed by intermittent schedules. Much of their 
argument rests on demonstrations that increasing re- 
ward size reduced overall rate of drinking during 
single test sessions. These tests are not valid because 
the increased drinking that is associated with increases
in amount of food takes time to develop, as the an- 
imals must have time to learn that the reward size 
(incentive value) in the situation has increased (e.g.,
Hawkins et al., 1972). It is known that induced drink- 
ing, once developed, is controlled by each meal, as a 
discriminative stimulus (e.g., Staddon & Ayres, 1975). 
Hence a reduction in meal rate, such as might occur 
following an increase in meal size, would automatically
yield a reduction in rate of drinking, at least at
first. It seems likely that a reduction in meal rate did
occur, since although Lotter et al. do not report the 
actual (as opposed to the scheduled) intermeal inter- 
val in their one hour test sessions, the total number of 
pellets consumed during tests was disproportionately 
small. For example, in their third experiment reward 
size was increased from one to 12 pellets. Yet the 
number of pellets consumed per hour increased from 
70.5 to only 218.5. Hence the number of meals must
have dropped. A reduction in rate of drinking under 
these conditions says something about the stimulus
control of drinking once it has developed, but nothing 
about the reasons for its development. 
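
The inference that the number of meals must have dropped follows from simple arithmetic on the figures quoted above. The sketch below assumes, for illustration only, that every pellet consumed in the test session arrived as part of a complete 12-pellet delivery; the function name is hypothetical.

```python
def meals_per_hour(pellets_per_hour, pellets_per_meal):
    """Number of food deliveries ("meals") implied by total pellet intake."""
    return pellets_per_hour / pellets_per_meal


# Figures quoted in the text for the third experiment of Lotter et al.
before = meals_per_hour(70.5, 1)    # one pellet per delivery
after = meals_per_hour(218.5, 12)   # twelve pellets per delivery (assumed all eaten)

print(f"meals/hr before: {before:.1f}, after: {after:.1f}")
```

On this assumption the meal rate falls from about 70 to about 18 per hour, so a lower drinking rate would be expected from stimulus control alone.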

On periodic schedules, the “post-reinforcement 
pause” is controlled by food delivery as a discriminative
stimulus. Since the pause is generally taken up
with interim activities, their duration must be simi- 
larly controlled by food. When the duration or amount 
of food delivery is increased (as occurred during test 
sessions in the Lotter et al. experiment), post-reinforce- 
ment pause generally increases, although the increase 
may be transient (Jensen & Fallon, 1973; Staddon, 
1970, 1974). Thus the first effect of increasing “meal” 
size should be an increase in the amount of drinking 
per meal. In a careful reanalysis of the test session
data of Lotter et al., Millenson (1975) has recently 
shown this to be the case. Thus, their data are consistent
with what is known about schedule-induced
drinking and temporal control on periodic schedules 
but not, unfortunately, with their conclusion. 


Opportunity Hypothesis. Falk (1969) originally 
suggested that schedule-induced drinking is related to 
the intermittency of food delivery. In its simplest 
form this view suggests that the rate of drinking is 
more or less constant with the animal drinking for a 
fixed fraction of the time available between pellets. It 
implies that the function relating amount drunk per 
interval to size of interval should be monotonically 
increasing; whereas this function is actually bitonic 
(Flory, 1971). Hence this view is not acceptable. 
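
The monotonic prediction of the simple opportunity view can be made concrete with a minimal sketch. The lick rate, the fraction of time spent drinking, and the interval values are hypothetical numbers chosen only to show the shape of the prediction.

```python
def amount_per_interval(interval_s, lick_rate_per_s, fraction_drinking):
    """Licks per interfood interval if drinking fills a fixed fraction of it."""
    return lick_rate_per_s * fraction_drinking * interval_s


# Hypothetical values: 5 licks/s while drinking, 25% of each interval.
intervals = (15, 30, 60, 120, 240)
amounts = [amount_per_interval(t, 5.0, 0.25) for t in intervals]
for t, a in zip(intervals, amounts):
    print(t, a)
```

Under these assumptions the amount drunk per interval grows without limit as the interval lengthens, which is the property contradicted by a bitonic function.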


Adventitious Reinforcement Hypothesis. This 
hardy perennial has been applied to interim drinking 
as well as to terminal “superstitious” responses (e.g., 
Clark, 1962; Moran, 1974; Segal, 1965). All the objec- 
tions to it raised earlier (p. 127) also apply here. In 
addition, induced drinking rarely occurs contiguously 
with food delivery (Segal, 1969 is an exception), and 
is little affected by lick-contingent delays of food de- 
livery unless these are so extreme as substantially to 
reduce food rate. As Figure 2 shows, over most of the 
range a reduction in food rate results in a decrease in 
drinking rate. Unless food rate is controlled, therefore, 
suppressive effects of a negative contingency cannot be 
interpreted as acting directly on the tendency to 
drink. Induced drinking develops relatively slowly, 
in step with the development of temporal discrimina- 
tion on periodic schedules (Staddon & Ayres, 1975). 
Hence activities other than drinking are at first con- 
tiguous with food delivery. The adventitious rein- 
forcement view neither explains why these drop out, 
nor why drinking (rather than some other activity) 
supplants them in almost every individual rat. When 
the water bottle is made available for only a brief 
period during the inter-food interval (Flory & O’Boyle,
1972), drinking still develops, even though lick-food 
contiguities are specifically excluded and the water 
bottle itself is a stimulus signaling the absence of 
food (S^Δ).


Motivation Hypothesis. The steep fall-off in the 
functions of Figure 2 at low food rates, the greater 
drinking with 2-pellet versus 1-pellet food deliveries, 
the progressive development of schedule-induced 
drinking, in step with food-anticipation (Reynierse & 
Spanier, 1968; Staddon & Ayres, 1975), and the inverse 
relation between schedule-induced drinking and body 
weight (Bowen, 1972; Falk, 1969), all point to an effect
of food motivation, hunger, and incentive, on induced 
drinking. The more motivated the animal (hunger: 
deprivation, body weight) and the more motivating 
the situation (incentive: frequency, amount, and type 
of food) the greater the tendency to drink. The in- 
centive motivation factor seems to follow the princi- 
ples now being codified as the quantitative law of 
effect (see de Villiers, Chapter 9). Thus drinking rate 
appears to be directly related to amount of food (Fig- 
ure 2), palatability of food (Falk, 1971), and frequency 
of food (Figure 2). Jacquet (1972) has studied the in- 
teractions between two components of a multiple VI 
VI schedule in terms of the relative and absolute rates 
of both bar pressing and induced drinking in rats. She 
found that as the relative rate of food reinforcement 
in one component increased (owing to a decrease in 
the rate of reinforcement in the other component), the 
absolute rate of drinking tended to increase. This is 
positive behavioral contrast, an effect often found 
with food-reinforced behaviors and generally attrib- 
uted to a change in stimulus contingencies (see Chap- 
ter 3). 

Thus, the evidence appears to support the view 
that induced drinking is related to food motivation, 
which is, in turn, affected both by internal factors 
(deprivation) and external factors (incentive). In a 
later section I discuss possible mechanisms of interac- 
tion between the motivational states of hunger and 
thirst that might underlie the empirical relation be- 
tween induced drinking and food motivation. 


INDUCTION: TERMINAL AND 
INTERIM ACTIVITIES 


Since both interim activities and the terminal re- 
sponse tend to increase with food rate, and since 
neither can increase without limit, it seems likely that 
these two classes of activity are in competition. This 
section presents the evidence for such competition and 
looks at some of its effects. 

Functions relating food-reinforcement rate to rate 
of pecking or lever pressing (response functions) have 
for some years been a standard way of representing 
the effects of schedules on behavior. The typical 
schedule has been variable-, rather than fixed-interval, 
and the response an instrumental rather than an in- 
duced one. However, there is every reason to suppose 
that terminal responses on both response-contingent 
and response-independent schedules are related in 
similar ways to reinforcement and motivational vari- 
ables. Manipulations such as a shift in relative rein- 
forcement frequency produce similar contrast effects 
in both (e.g., Gamzu & Schwartz, 1973; Redford &
Perkins, 1974), and the response contingency seems to 
act more to select one terminal response over others 
than to affect the “strength” of the response once 
selected (Staddon & Simmelhag, 1971). This is not to 
say that all terminal responses have similar properties 
—there is by now ample evidence that they do not 
(e.g., Hinde & Stevenson-Hinde, 1973)—just that there 
seems to be no strong effect of response contingency 
per se on the properties of a given terminal behavior 
once it is established. 

An analysis in terms of terminal and interim peri- 
ods associated with different regions of post-food time 
does not seem as applicable to variable-interval as to 
fixed-interval schedules. Food does not seem to have 
the same kind of discriminative-stimulus status on 
variable-interval schedules as it does on fixed-interval. 
Nevertheless it can be argued that on VI, as on FI, 
time is segmented into interim and terminal periods, 
although these are not as simply related to post- 
reinforcement time (Rachlin, 1973). (The factors ac- 
counting for this difference, and for the temporal 
location of activities within the inter-food interval, 
will be taken up in the third section.) Thus it seems 
reasonable to assume that the rate of a terminal 
response such as pecking, once it is established, de- 
pends much more on variables such as food rate, 
palatability of food, size of food portion, and hunger 
than on temporal relations between the response and 
food delivery. We have just seen that these are the 
variables that determine the level of induced drink- 
ing. In a sense, therefore, (instrumental) terminal as 
well as interim responses can be regarded as schedule- 
induced behavior. 

If each terminal or interim response (e.g., lick,
peck, bar press) is assumed to require an approxi- 
mately constant time, then any increase in rate of 
responding with food rate implies that the responding 
will take up an increasing fraction of the interfood 
interval as interval length decreases.¹ As we have already
seen, the rates of both induced drinking and instrumental
responding increase with food rate. Hence
it is reasonable to postulate the eventual development
of competition between drinking and the terminal response
at high food rates. Moreover, there is some
evidence that induced drinking (at least) tends to
grow with food rate rather faster than terminal responding.
For example, the 1-pellet lick rate function
in Figure 2 has an initial slope greater than one. Yet
instrumental (terminal) responding, which is often
well described by Herrnstein’s equation, $P = kR_1/(R_1 + R_0)$,
grows linearly at first, since when $R_1$ is
small the equation reduces to $P = kR_1/R_0$, i.e., a slope
of one in the log-log coordinates of Figure 2. This
implies that the fraction of the interval taken up by
an interim activity such as drinking should increase
with food rate. The eventual flattening out of the response
functions for both interim and terminal activities
can be taken as one outcome of the competition
between them.


¹ This can easily be shown algebraically. For an interim
activity A, where each response takes up a fixed time $t_A$, then if
A takes up a fixed fraction, $k_A$, of the interfood interval, $T$:
$N_A t_A = k_A T$, where $N_A$ is the average number of occurrences of
A per interval. The response function for A is then given by
$N_A/T = k_A/t_A$, which is a constant. Hence any growth in the
rate of A with food rate means that A takes up an increasing
fraction of the interval as food rate increases. If the rate of A is
proportional to food rate, $N_A/T = k_A/T$, then the fraction of the
interval taken up by A is equal to $t_A k_A/T$, i.e., also proportional
to food rate. Similar calculations can be carried out for the
terminal response, B, and it is clear that when $t_A k_A + t_B k_B = T$
(where $k_A$ and $k_B$ are the constants of proportionality) the entire
interval is taken up.
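
The statement that Herrnstein’s equation has a slope of one in log-log coordinates at low food rates can be checked numerically. The following sketch evaluates the hyperbola and estimates its log-log slope; the values of `k` and `r0` are hypothetical, and the finite-difference helper is an illustrative device rather than part of any published analysis.

```python
import math


def herrnstein(r1, k=100.0, r0=1.0):
    """Response rate P = k*R1 / (R1 + R0); k and r0 are hypothetical values."""
    return k * r1 / (r1 + r0)


def loglog_slope(f, x, eps=1e-6):
    """Numerical slope of log f(x) against log x at the point x."""
    lo, hi = x * (1.0 - eps), x * (1.0 + eps)
    return (math.log(f(hi)) - math.log(f(lo))) / (math.log(hi) - math.log(lo))


# The slope is close to one when r1 is small relative to r0
# and falls toward zero as r1 grows.
for r1 in (0.01, 0.1, 1.0, 10.0):
    print(r1, round(loglog_slope(herrnstein, r1), 3))
```

Analytically the slope is R0/(R1 + R0), so a function whose initial log-log slope exceeds one, as reported for the 1-pellet lick rate function, must grow faster than this hyperbola at low food rates.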


Running. Figure 3 illustrates another effect of 
competition between terminal and interim activities.
The figure shows rates of wheel running and drink- 
ing (in licks/min and ml/min) of five female rats 
exposed to five different fixed-time schedules in a 
hexagonal apparatus that afforded access to a variety 
of activities (Staddon & Ayres, 1975 and unpublished). 
Drinking rate increases with food rate, as in Figure 2, 
and the function form for licks/min and ml/min is 
the same. However, the rate of wheel running de- 
creases as food rate increases, suggesting suppression 
by competition from drinking and the terminal re- 
sponse (which in this case was waiting in the feeder 
area accompanied by pawing and chewing at the 
feeder opening: “food anticipation”). Thus, it ap- 
pears as if running is not schedule induced, but rather 
“fits in” at times when the tendency to engage in the 
two dominant classes of activity is weak. This view is
in agreement with the temporal distribution of the 
three classes of activity, with drinking occurring first 
in the interval, followed by running and then food 
anticipation (cf. Figure 1). It is also consistent with 
the finding of Staddon and Ayres (1975) that during 
acquisition, the temporal pattern of food anticipation 
and drinking developed to essentially its final form 
before much running had occurred. 

Other evidence suggesting that running is not 
schedule induced is that when food delivery is dis- 
continued (extinction), running rate and the fraction 
of total time the animal spends in the running wheel 
area increase (Staddon & Ayres, 1975, and unpublished).
Even if running is required for the production
of food, its overall frequency may not be increased,
although its temporal distribution adapts to
the schedule (Skinner & Morse, 1958).


Fig. 3. Drinking rate (licks/min and ml/min) and running
rate (turns of a 27-cm diameter running wheel/min) vs. food rate
on various fixed-time food schedules for four female rats (Staddon
and Ayres, unpublished data). Points show individual rats;
lines are means. Each rat was exposed to four of the five
food rates; hence each point is the average of four animals.

Fig. 4. Schematic representation of the relation between interfood
interval and the proportion of the interval taken up by
interim, “facultative,” and terminal behaviors.

Levitsky and Collier (1968) have reported an increase
in running under schedule conditions as opposed
to extinction. However, the rat was entirely
enclosed in a running wheel in their apparatus, so
that running may have been confounded with general 
activity, which, as Killeen (1975) has shown, increases 
with food rate. Similarly, Staddon and Ayres (1975) 
report a decrease in overall activity, measured as area 
changes per min, in extinction, but wheel running in 
their apparatus increased in extinction. Smith and 
Clark (1974), using an apparatus similar to that of 
Levitsky and Collier and a multiple spaced-responding 
schedule, obtained mixed results: one rat showed less 
running at low food rates, two others showed a bitonic 
relation. Thus, apart from exceptions that may reflect 
peculiarities of some kinds of running-wheel ap- 
paratus, it appears that running is suppressed by a 
food schedule. 

The picture that emerges from this account is summarized
in Figure 4. The figure shows the inter-food
interval divided into three periods: an interim² period,
devoted to drinking (if water is available) and
perhaps other activities, such as aggression (see below);
a terminal period, devoted to food anticipation
or the instrumental response; and a third period,
when activities not induced by the schedule, facultative
activities, can occur. Running appears to be a
facultative activity in this sense. Other possibilities
are comfort activities, such as preening and grooming.
The diagram has two critical features: (a) that the
percentage of time devoted to both terminal and
interim activities increases as interval size decreases,
and (b) that this progressive increase is limited at
short intervals, when the entire interval is taken up
by the two classes of induced activities.


² Until this point the term “interim” has been applied to all
those activities which precede the terminal response within the
interfood interval. In Figure 4, however, only activities induced
by the schedule, such as drinking, are so labelled; “neutral”
activities, such as running, fall into the “facultative” category.
There seems to be no reason to settle on either usage as definitive,
providing that whenever the term is used it is clear which
is intended.
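
The two features of Figure 4 can be illustrated under the proportionality assumption of footnote 1, in which the time taken by each induced activity per interval is fixed, so that its share of the interval grows as the interval shrinks. The sketch below is an assumption-laden illustration: the parameter values are invented, and the rule of scaling both induced activities down proportionally once they would exceed the whole interval is a convenience of this example, not a claim about the animal.

```python
def time_allocation(interval, t_a, k_a, t_b, k_b):
    """Return (interim, terminal, facultative) fractions of the interval.

    t_a, t_b: time per response for the interim (A) and terminal (B) activity
    k_a, k_b: constants of proportionality (responses per interval)
    """
    interim = t_a * k_a / interval
    terminal = t_b * k_b / interval
    total = interim + terminal
    if total > 1.0:
        # Assumed rule: induced activities share the interval in fixed ratio.
        interim, terminal = interim / total, terminal / total
    facultative = max(0.0, 1.0 - interim - terminal)
    return interim, terminal, facultative


# Hypothetical values: 50 licks of 0.2 s each, 40 terminal responses of 0.5 s each.
for interval in (240.0, 120.0, 60.0, 30.0, 15.0):
    print(interval, time_allocation(interval, 0.2, 50, 0.5, 40))
```

With these values the facultative share shrinks as the interval shortens and disappears at 30 sec, where the two induced classes fill the interval.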

The diagram reveals several uncertainties. First, 
the exact form of the area boundaries is not known. 
In particular, it is necessary to know the function 
relating overall rate of an activity such as licking or 
key pecking to the percentage of time taken up. The 
simple proportionality assumed for illustrative pur- 
poses in footnote 1 may not hold generally. Such func- 
tions are not available at present. It is also likely that 
the boundaries of the interim and terminal areas are 
not fixed, but depend on the strength of behavior in 
the “other” category. For example, in the experiment 
by Staddon and Ayres (1975), one rat showed anomalous
behavior traceable to a very strong tendency to
run. It failed to show the usual temporal distribution 
of activities within the inter-pellet interval until 
either running was prevented, or more time was made 
available for it by preventing drinking (see p. 147). 
The diagram is also restricted to periodic food schedules
in which opportunities for all activities are continuously
available; it does not deal with either variable
schedules or limited-availability schedules.
Finally, the possible mechanisms underlying the dis- 
tribution of activities remain to be considered. 


INTERIM ACTIVITIES AND S^Δ PERIODS


Drinking and other schedule-induced interim ac- 
tivities always seem to occur (in the steady state) at 
times, or in the presence of stimuli, that signal the 
absence of food (S^Δ or interim periods). On fixed-interval
schedules, food is not available early in the
interval. Hence the occurrence of schedule-induced 
drinking or attack at those times is explainable. How- 
ever, food can occur at any time on variable interval 
schedules: food delivery is as likely just after food as 
it is at other times. Yet even on variable schedules, 
induced drinking still tends to be restricted to the 
period just after food delivery. How can this be ex- 
plained? 

Under free conditions, rats tend to drink just be- 
fore and just after meals (e.g., Kissileff, 1969). It seems 
likely that post-prandial drinking, at least, is main- 
tained when animals are exposed to intermittent food. 
Hence even on a variable schedule, if water is available,
it is possible that food will not be eaten in the
period just after a previous food delivery. This period 
therefore qualifies as an interim period in terms of the 
obtained, if not the scheduled, distribution of food 
deliveries. The temporal locus of induced drinking 
on variable schedules may thus be traced to the way 
in which the animal’s initial behavior interacts with 
the properties of the schedule; by drinking after eat- 
ing, the animal can produce an obtained distribution 
of intereating times that is quite different from the 
programmed distribution. The obtained distribution
then maintains the behavior that led to it. This kind 
of “self-fulfilling” schedule-behavior interaction is not 
an uncommon occurrence in situations that allow for
the expression of induced behavior (see Staddon & 
Ayres, 1975, for other examples). Any irregularity in 
the function relating probability of food to post-food 
time (i.e., any deviation from a constant probability) 
should accentuate this “self-fulfilling” tendency,
especially if the probability is relatively lower (not 
necessarily zero) in the immediate post-food period. 
Millenson (personal communication) has data in sup- 
port of this inference, since he finds that schedule-in- 
duced drinking is less reliably obtained on a random- 
interval (i.e., constant probability) schedule than on 
a variable-interval schedule with an arithmetic progression
of intervals (in which the probability of food
increases with post-food time). 
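
The contrast between the two kinds of schedule can be made explicit by computing the conditional probability of food at each post-food time, given that food has not yet been delivered. The sketch below assumes a variable-interval schedule whose intervals are distinct and equally likely, and treats the random-interval schedule as a constant probability per time step; the interval values and the probability are hypothetical.

```python
def hazard_arithmetic_vi(intervals):
    """Conditional probability of food at each scheduled post-food time.

    Assumes the listed intervals are distinct and equally likely, so the
    probability at a given time is one over the number of intervals
    that have not yet elapsed.
    """
    ordered = sorted(intervals)
    remaining = len(ordered)
    hazards = {}
    for t in ordered:
        hazards[t] = 1.0 / remaining
        remaining -= 1
    return hazards


def hazard_random_interval(p, times):
    """Constant-probability schedule: the same value at every post-food time."""
    return {t: p for t in times}


times = [10, 20, 30, 40, 50]  # hypothetical arithmetic progression (sec)
vi = hazard_arithmetic_vi(times)
ri = hazard_random_interval(0.2, times)
for t in times:
    print(t, round(vi[t], 3), ri[t])
```

On the arithmetic schedule the conditional probability is lowest just after food and rises with post-food time, whereas on the constant-probability schedule no post-food period is distinguished from any other.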

This interpretation recognizes a link between eating
and subsequent drinking, although such a link
need not be either strong or unmodifiable. For 
example, Flory and O’Boyle (1972) have shown that if 
drinking is prevented in the period just following 
food delivery, but is permitted later in the interfood 
interval, the rat drinks almost as much as when water is
continuously available. Gilbert (1974) has reported a 
similar result. Drinking occurs after each response on
spaced-responding schedules (Segal & Holloway, 1963)
and has been reported to occur after brief stimuli on
second-order schedules (Porter & Kenshalo, 1974;
Rosenblith, 1970) although there are some conflicting
results (Allen, Porter & Arazie, 1975; Porter, Arazie, 
Holbrook, Cheek & Allen, 1975). These results all 
suggest that although the link between eating and sub- 
sequent drinking is a factor in the temporal location 
of drinking in the interfood interval, it is not essential 
to its induction. 

In limited-availability procedures, presentation of 
the water bottle signals a period when food will not 
be delivered. The fact that rats drink with undimin- 
ished vigor during these periods conforms to the 
general conclusion that induced drinking is characteristic
of interim (S^Δ) periods (Falk, 1969, 1971; Staddon
& Simmelhag, 1971). It also indicates that such
periods need not be defined temporally (cf. Wüttke &
Innis, 1972). If interim activities have something to do 
with the animal’s ability to “time,” i.e., to refrain 
from making the terminal (food-related) response at 
times when food is not available, one might expect 
them to be reduced in strength when an external cue 
is available. Unfortunately the effect of stimuli on the 
strength and locus of interim activities has not been 
systematically assessed. 


SCHEDULE-INDUCED ATTACK 


Rats show strong induced drinking on periodic 
food schedules, but pigeons and doves show weaker 
effects (Shanab & Peterson, 1969). However, several 
experiments have shown that these birds on food 
schedules will attack another bird, a stuffed model, a 
mirror, or even a color slide of a pigeon (e.g., Azrin, 
Hutchinson, & Hake, 1966; Cohen & Looney, 1973; 
Flory & Ellis, 1973). Squirrel monkeys show schedule-induced
biting attack on a rubber hose (Hutchinson,
Azrin, & Hunt, 1968) and rats on schedules of food
or water reward will attack another animal (Gentry &
Schaeffer, 1969; Thompson & Bloom, 1966).

Fig. 5. Data replotted from Cohen and Looney (1973), Cherek
et al. (1973) and Flory (1969) relating measures of attack rate to
food rate on various intermittent schedules (see text for details).

Figure 5 shows data replotted (in log-log coordi- 
nates) from three experiments that have studied 
schedule-induced attack in pigeons as a function of 
food rate (Cherek, Thompson, & Heistad, 1973; Cohen 
& Looney, 1973; Flory, 1969). Cherek et al. measured 
the rate at which pigeons pecked against the front of 
a transparent box containing a live target bird, on a 
response-initiated fixed-interval schedule. Flory used a 
stuffed target pigeon and a fixed-time food schedule; 
and Cohen and Looney used a mirror as the target 
and a multiple fixed-ratio 25 fixed-ratio N (N varied 
from 25 to 150) schedule. Cherek et al. and Flory both 
used a “delay” contingency which prevented food 
delivery within 15 sec of an attack response—a com- 
mon protection against “adventitious” reinforcement 
of attack by food delivery. The target bird was avail- 
able on a fixed-ratio 2 schedule for 15 sec at a time in 
the Cherek et al. study, but continuously available in 
the others. Despite these differences of target, sched- 
ule, and target availability, the attack rate data in 
Figure 5 show considerable overall agreement. Rate of 
aggressive pecking peaks at a food rate between .30 
and 1.0 per min and declines sharply at lower and 
higher rates. In linear coordinates the falloff at high 
food rates is more gradual than the falloff at low 
rates. The absolute rate of target pecking in the Cohen
and Looney study was higher than in the others be- 
cause they report “local” rate (i.e., rate in the post-food 
pause) rather than overall rate. The decline in re- 
sponding at higher food rates in their study was more 
gradual (in linear coordinates), but it is not clear 
whether this was because of the ratio food schedule, 
the mirror-image target, or other features of their 
situation. 

The general agreement among the functions in 
Figure 5 suggests that food rate (deliveries/min) is the 
determining factor in all these experiments and that 
the way in which this food rate comes about, whether 
via an interval or a ratio schedule, is much less im- 
portant. Certainly periodic food is essential to the 
attack responding, since attack declines to a negligible 
level in extinction (Cherek et al., 1973). In an explicit 
ABAB comparison of response-dependent and response-independent
interval schedules of food delivery,
Cherek et al. found little consistent effect of the re- 
sponse contingency on rate of attack. There are as yet 
no data on yoked-control comparison of time versus
ratio procedures, so factors in addition to food rate 
cannot be entirely excluded. 


DRINKING AND ATTACK


Comparison of the attack rate data in Figure 5 
with the drinking rate data in Figure 2 shows that al- 
though both types of behavior fall off drastically at 
food rates less than about 0.5 per min, attack respond- 
ing also shows a decline at higher food rates. Induced 
drinking only shows such a falloff at very high food 
rates (in excess of about one per 4 sec, Flory, 1971). 
Staddon and Ayres (unpublished) find that this de- 
cline in the average function occurs because some 
individuals fail to drink at all when food is delivered 
very frequently. When pellets were delivered every 5 
sec perhaps half the rats in the Staddon and Ayres 
study failed to drink in most intervals (cf. Segal et al., 
1965). The most obvious explanation for this is that
the time available between food pellets is simply in- 
sufficient, although since some rats continued to drink 
the limitation seems not to be the purely mechanical 
one of getting to the water bottle and back again be- 
fore the next pellet delivery—especially as rats on 
some variable schedules will drink so much that they 
postpone food delivery or fail to pick up the pellet 
when it is delivered (Clark, 1962; Falk, 1961). The 
decline in the attack rate function occurs at relatively 
low food rates and cannot be explained by any kind 
of mechanical limitation. 

Induced attack, like induced drinking, is an interim 
activity and occurs in the period just after food delivery
on fixed-interval and fixed-ratio schedules
(Richards & Rilling, 1972). Like induced drinking, it
occurs after each response on spaced-responding
(DRL) schedules (Knutson & Kleinknecht, 1970). 
However, data are lacking on its temporal position 
relative to other induced activities, such as drinking, 
preening, etc. when opportunities for several are
available. Comparison is made more difficult because
most attack studies have been done with pigeons, most
studies of induced drinking with rats. It would also be
useful to find a facultative activity analogous to running
in rats, that could be used in similar fashion to
help clarify the interactions among induced activities
in pigeons.

The early falloff in attack rate as food rate in- 
creases beyond one delivery per 2 min suggests a 
difference in the time courses of drinking and attack. 
Perhaps there is a limit to the speed with which a 
tendency to post-eating attack can build up, no matter 
how strong the inducing factors, i.e., the hunger and
incentive motivation associated with the food sched- 
ule. Data on the effects of hunger and of food type 
and amount on attack rate would help to sort this out. 
For example, a direct relation between hunger and 
attack rate on FI 2 min would suggest that the decline
in attack rate at higher food rates reflects a time- 
course limitation, rather than a non-monotonic rela- 
tion between food motivation and tendency to attack. 


OTHER INTERIM ACTIVITIES 


Data on interim activities other than drinking, 
running, and attack are sparse (see Falk, 1971, for a 
review). Pica (eating of non-food objects) has been 
observed in rats and monkeys (e.g., Villareal, 1967); 
and pigeons, humans, and sometimes rats, show a 
variety of stereotyped motor patterns, such as pacing, 
neck stretching, wing flapping and preening or groom- 
ing (Kachanoff, Leveille, McLelland & Wagner, 1973; 
Keehn, 1972; Staddon & Simmelhag, 1971). In unpublished
observations I have noticed that schedule-induced
wing flapping in pigeons is a strikingly autonomous
behavior. Although all the necessary tests
have not been carried out (see pp. 145 et seq.), it
appears as if each bout is internally timed (see p.
145) so that wing flapping persists even if food is
made available during a bout. Pigeons will sometimes
repeatedly miss food deliveries in this way. Gilbertson 
(personal communication) has obtained reliable in- 
duced preening by attaching a piece of solder wire to 
the pigeon’s wing as a minor irritant. There is also
one report in which pigeons, trained to key peck for 
food on a fixed-ratio schedule, pecked a bolt head as 
an interim activity (Miller & Gollub, 1974).

There are also few data on schedules using positive
reinforcers other than food or water. Gilbertson (personal
communication) has trained male pigeons to
peck on a ratio schedule for the sight of a female.
These birds show courtship behavior, bow-cooing and
wing flapping, as interim activities, as an after-effect
of the sight of the female (see also Nelson, 1965;
Sevenster, 1973).


Induced States 


The regions of post-food time identified in Figure
4 as “interim,” “facultative,” and “terminal,” and their
associated behaviors are more properly considered as
states or “moods” of the behaving animal (rather than
simply as “behaviors”). This is because they represent
different kinds of behavioral potential. The same
stimulus, water for example, has different effects dur-
ing the interim period from those it has during the
terminal period. The rat drinks in the one but not in
the other; hence his state must have changed, and this
change can be traced to the different temporal cues
effective during these two periods. Similarly, brief test
presentations of food during the interim period may
fail to elicit eating (Konorski, 1967; Staddon, unpub- 
lished observations), at least the first couple of times. 
Eventually the animal will eat, but by then the dura- 
tion of the interim period (which is defined by the 
animal’s history of exposure to food opportunities) 
will have changed to allow for eating earlier in the 
interval, as it does when an animal is shifted from a 
fixed- to a variable-interval schedule (e.g., Innis & Stad-
don, 1971; Rachlin, 1973).

The states that fill up the interfood interval differ 
in their motivational properties. Not only does the rat 
drink during the interim period, but its state resem- 
bles thirst, just as the terminal state resembles hunger. 
For example, rats and monkeys will learn to press a 
lever to obtain access to water during the interim 
period (Falk, 1971). Whether “thirst” in this sense is
identical to “thirst” that follows water deprivation is 
hard to say. They certainly share many properties, as 
will become clear in a moment. 

Perhaps because of a preoccupation with stimulus- 
response notions of behavioral causality, the proper- 
ties even of terminal states have been little explored 
until recently. Pavlov (1927) induced such states by 
means of stimulus-reinforcer contingencies, but 
studied only a fraction of the animal’s potential be-
havior: the “conditioned response” of salivation.
Occasional anecdotes filtered out of his laboratory 
suggesting that the conditioning operations produced 
much more extensive changes than this. For example, 
Liddell (recounted in Lorenz, 1969) noticed that a dog 
released from its harness would approach and jump 
upon the metronome CS. Zener (1937) in a classic 
paper described a variety of other behaviors in the 
presence of the conditioned stimulus, suggesting that 
animals develop “expectations” about the imminence 
of food. Bolles (1972) has recently defended a revival
of this position. Recent work on auto shaping (e.g.,
Browne, 1973) lends it some support. 

Labelling induced states with terms such as 
“hunger” or “food expectancy” is convenient, and re- 
minds us of moods familiar from introspection. Un- 
fortunately a feeling of familiarity is not the same 
thing as exact knowledge, and may even hinder the 
search for it. Induced states can be explored in two
main ways: (a) By looking at the effect of various test 
stimuli, when the animal is in the state, as compared 
to when it is not; (b) by looking at the effectiveness of 
various reinforcers when it is in the state as compared 
to when it is not. The first is equivalent to the method 
of transfer tests, used to discover “what is learned” in
a learning situation. It is concerned with what might 
be termed the cognitive and perceptual properties of
the induced state. For example, one could see whether
a rat on a periodic food schedule will respond in the 
interim period to stimuli that in its past have been 
associated with access to water. The second method is 
concerned with the motivational properties of the 
state, with, for example, the reinforcing effectiveness of
water during a food schedule as compared with its 
effectiveness in the absence of the schedule. 

The cognitive properties of induced interim states 
have not been adequately explored; much more has 
been done on their motivational properties. I first con- 
sider the similarities between induced drinking and 
thirst, and then the properties of interim states gen- 
erally. 


INDUCED DRINKING AND THIRST


Falk (1969, 1971) has identified several factors that 
point to a similarity between thirst and the rat’s state 
when induced drinking is observed: (a) The acquisi- 
tion of induced drinking is impaired by pre-loading 
with water; (b) rats and squirrel monkeys will learn 
to press a lever during the interim period to obtain 
access to water; (c) rats drink less if the terminal food 
reinforcer contains water; (d) rate of induced drinking 
is directly related to the palatability of the available 
liquid. To these can be added the finding that water-
deprived rats will lick a stream of air (e.g., Mendelson,
Zielke, Slangen & Weijnen, 1972; Werner & Freed, 
1973), and this activity can be induced by food sched- 
ules in the same way as drinking. On the other hand, 
some physiological studies report differences between 
schedule-induced and “normal” drinking (e.g., Carlisle,
1971, 1973). 

Water acts as a reinforcer for water-satiated rats on 
periodic food schedules, as shown by its effectiveness 
in maintaining lever pressing. Along the same lines, a 
study by Allen and Porter (personal communication) 
using a multiple FI 1 min FI 1 min food schedule, 
showed positive contrast effects with a water-reinforced 
response. Water was at first available in both com- 
ponents of the multiple schedule on a FI 0.75 sec 
schedule. Later, response rate on the water lever was 
recorded in one component as a function of whether 
or not water was available in the other. When water 
was removed in one component, response rate on the 
water lever in the other component increased (posi- 
tive contrast). Thus, the tendency to drink on 
periodic food schedules is determined both by food 
rate variables and by variables related to water avail- 
ability. Since the effectiveness of water as a reinforcer 
is presumably also related to food rate, these effects 
point to quite complex interactions between food rate
(the fundamental instigating variable), the tendency
to drink, and the capacity of water to act as a rein- 
forcer. 

These data on the effects of water motivation on 
induced drinking support the earlier generalization 
that the strength of schedule-induced drinking is
jointly determined by both food and water motiva- 
tion. Its sensitivity to water-motivation factors under- 
lines both its ‘state’ character and its similarity to 
normal thirst. The evidence is not sufficient to assert 
that deprivation-induced and schedule-induced
“thirst” are identical, however, and it would be sur-
prising if they were in view of the different temporal
properties of the two. 


TERMINAL-INTERIM INTERACTION


Several lines of evidence suggest that for many 
species the interim period on periodic food schedules 
is aversive: (a) Many of the interim activities devel- 
oped by pigeons on fixed-time schedules are suggestive 
of flight: neck-stretching, hopping, wing flapping, and 
even brief hopping flights (Staddon & Simmelhag, 
1971; unpublished observations). Many of their move- 
ments resemble the intention movements made by 
wild pigeons just before they take flight (e.g., Davis, 
1973). (b) Pigeons will learn to peck a key during the 
interim period on FI to produce a timeout (house- 
lights off and different stimuli on the response keys), 
and their tendency to do so is bitonically related to 
interval value, although the peak is at about FI 4 
rather than FI 2, the value for maximum attack 
(Brown & Flory, 1972). Numerous other studies have 
shown that both rats and pigeons find interim periods 
aversive (e.g., Appel, 1963; Azrin, 1961; Thompson,
1964). (c) As we have already seen, pigeons, rats, and
monkeys will attack other animals, objects, or repre- 
sentations during the interim period, and pigeons will 
peck a key for the opportunity to do so (Cherek et al., 
1973). 

Staddon and Simmelhag (1971) tentatively pro- 
posed that there is a reciprocal interaction between 
the terminal and interim states on periodic schedules; 
and that this mechanism serves the adaptive function 
of removing animals from food situations at times 
when food delivery is unlikely. There is some evidence
that enforced proximity to the food site enhances 
schedule-induced drinking. Clark (1962) found less 
drinking when the water bottle was moved away from 
the food tray, and Staddon and Ayres found much less 
drinking in their hexagonal apparatus (in which food 
and water sites were separated) than that reported by 
Flory (1971) and others (compare Figures 2 and 3). In-
duced drinking and the food-related terminal response
are related to food motivation in a similar way; hence 
the “hungrier” the animal during the terminal period, 
the “thirstier” he is during the interim period. These
data, together with physiological links between hypo- 
thalamic structures involved in eating and drinking 
(Akerman, Andersson, Fabricius, & Svensson, 1960; 
Wayner, 1970), and at least one demonstration, with 
doves, of adjunctive eating on a water schedule (Mc- 
Farland, 1965), make the idea of a reciprocal interac- 
tion between these two motivational states a plausible 
one. There are difficulties in testing the idea of com- 
plete reciprocality, however, because severely water-
deprived animals will not eat. This may underlie the
failure of Carlisle, Shanab, and Simpson (1972) to in- 
duce eating in thirsty rats by means of a periodic 
water schedule. 

The notion that during the interim period on a 
food schedule animals are motivated in ways antag- 
onistic to food motivation explains the apparent 
aversiveness of the interim period. A thirsty animal 
might well try to escape from a food situation. Indeed, 
Pliskoff and Tolliver (1960) have shown that hungry 
rats maintained on a fixed-ratio food schedule will 
respond more on a second lever that removes them 
from the food schedule (by producing a 5-min time- 
out) when deprived of water for three days than when 
not water-deprived. There are several uncertainties, 
however. For example, even in large enclosures 
pigeons do not stray far from the response key during 
the interim period (Staddon, unpublished observa-
tions), although careful studies of the effects of sched-
ule variables on spatial position have not been carried
out. Hence the induced aversiveness of the food site
(or response key—the data do not distinguish between 
the two) must decline faster with distance than its 
attractiveness. What determines the form of such 
gradients and how might they be measured? Another 
difficulty is that most of the data on the aversiveness 
of the interim period come from pigeons, most data 
on induced drinking from rats. How legitimate is 
generalization from one species to another? 

Schedule-induced aggression poses a related prob- 
lem. Is it induced by the schedule in the same way as 
drinking? The falloff in the response function at high
food rates suggests not, but there may be other ex- 
planations for this. Or is aggression one outcome of 
the conflict between the tendencies to approach and 
retreat from the food site? This kind of explanation—
disinhibition of activity A owing to conflict between 
strongly excited incompatible activities B and C—has 
been proposed for the displacement activities studied 
by ethologists (cf. Hinde, 1970). However, the hypoth-
esis has not been presented in a quantitative way that
allows for a convincing test. 

It is also clear that the division into terminal and 
interim periods is an oversimplification. At interfood 
intervals greater than 20 or 30 sec the middle of the 
interval seems to be a period when non-induced (fac- 
ultative) activities, such as running, can occur. The 
growth of this period with interval duration parallels 
the shift from “break and run” to “scallop” on fixed- 
interval schedules as interval value increases (Schnei- 
der, 1969). The “scallop” period, when the cumulative 
record shows a gradual transition to a high, steady 
rate, may correspond to the “facultative” period in 
Figure 4. Possibly activities such as preening and 
grooming, which are not sensitive to food reinforce- 
ment contingencies (Shettleworth, 1973) and do not 
compete with other activities (“disinhibited” activ- 
ities; McFarland, 1970), can occur at this time. 

There are other puzzling facts that emphasize how 
little we understand the mechanisms underlying in- 
duced behavior. For example, pigeons trained to peck 
for food on a fixed-interval schedule will continue to 
do so if periodic food delivery is maintained indepen- 
dently of responding. However, if the interfood inter- 
val is relatively long, pecking often becomes confined 
to the middle of the interval (Shull, 1970; Staddon & 
Frank, 1975a). Is this pecking different from food- 
related pecking, such as that induced in auto shaping 
situations, which tends to occur with highest prob- 
ability close to food delivery? Or is it simply induced 
by the food situation in a similar way to induced 
drinking, where, at low food rates, each food delivery 
seems to produce a more or less constant amount of 
drinking? A final problem is the role of stimulus con- 
trol and “conditioning.” A behavior induced orig- 
inally for one reason may be maintained for another. 
Thus an interim activity that develops in the post- 
food period because this period signals the absence of 
food may subsequently come under partial control by 
features of the environment (as drinking comes under 
the control of the water-bottle, for example). Conse- 
quently, a change in the animal’s environment may 
affect terminal and interim behaviors differently. This 
differential effect may explain the effects of restraint 
on “temporal discrimination” described below. 


TEMPORAL AND SEQUENTIAL 
STRUCTURE OF INDUCED ACTIVITIES 


All the species commonly used in operant condi- 
tioning experiments, rats, pigeons, monkeys, and 
people, adapt the temporal pattern of their behavior
to the temporal pattern of reinforcer delivery. Indeed,
the temporal pattern of the instrumental response was 
the aspect of reinforcement-schedule performance that 
first attracted attention (Ferster & Skinner, 1957),
and it continues to provide a topic for dozens of re- 
search reports each year. This temporal regularity 
raises two obvious questions: (a) What determines the 
temporal locus of the various activities? (b) What is 
the nature of the “clock” that times these activities? I 
take up these two questions first. The third section 
deals more generally with types of sequential inter- 
action. 


Factors Affecting the Temporal Locus of 
Terminal and Interim Periods 


Most studies of the relation between the temporal 
sequence of reinforcers and the temporal sequence of 
behaviors have been concerned only with the pattern 
of the instrumental response. As we have already seen, 
periodic reinforcement generally produces a corre- 
sponding periodicity in behavior, with the instru- 
mental response occupying the last third or so of the 
inter-reinforcement interval. There are some excep- 
tions, in situations with weak reinforcers or labile re- 
sponses (e.g., Weiner, 1969), and many species (fish, 
octopus) apparently fail to show this kind of temporal 
adaptation. However, it is sufficiently common that 
the search for rules to describe it seems justified. 

The rule seems to be that the local rate of respond- 
ing (i.e., rate over some short time interval) is directly 
related to the relative proximity to reinforcement or, 
on aperiodic schedules, the relative density (i.e., rate 
over some brief time interval) of reinforcement 
(Catania & Reynolds, 1968; Jenkins, 1970; Staddon, 
1972b; Zeiler, Chapter 8 in this volume). In short, 
pigeons peck more at times when food is more likely. 
No one has yet succeeded in reducing this rather 
common-sensical principle to a mathematical form 
that is universally satisfactory (see de Villiers, Chapter 
9 in this volume, on the quantitative law of effect), 
but the details are not important for present purposes. 

This relative proximity rule should not be thought 
of as a causal law about the effect of independent 
reinforcement variables on dependent behavioral
ones. It is an equilibrium principle, that describes the 
steady-state relation between reinforcement and be-
havior, once behavior has settled down. On response-
contingent schedules reinforcement rate is affected by 
behavior (as well as vice versa) and often many 
equilibria are possible. For example, a pigeon on a 
high-ratio schedule may cease to respond because its 
initial response rate is too low to produce sufficient
reinforcement to sustain pecking. This outcome, with
zero reinforcement supporting zero responding, is just 
as consistent with the law of effect as an equilibrium 
in which a high response rate is maintained by a high 
reinforcement rate. By itself the law of effect does not 
predict which will occur. Other factors, involving the 
historical development of the final equilibrium, and 
including factors that may facilitate or inhibit in-
duced behavior, must be understood for a full ac-
count.
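
The coexistence of several equilibria can be illustrated with a small numerical sketch. It assumes a ratio schedule on which reinforcement rate is simply response rate divided by the ratio requirement, and a purely hypothetical S-shaped function relating reinforcement rate to response rate. The function `response_function` and its parameters `max_rate` and `half_point` are invented for illustration; they are not taken from any of the studies cited here.

```python
def feedback(responses_per_min, ratio):
    """Ratio schedule: reinforcers per minute earned by a given response rate."""
    return responses_per_min / ratio


def response_function(reinf_per_min, max_rate=100.0, half_point=1.0):
    """Hypothetical S-shaped relation between reinforcement rate and
    response rate (an assumption of this sketch, not an empirical law)."""
    r2 = reinf_per_min ** 2
    return max_rate * r2 / (r2 + half_point ** 2)


def settle(initial_rate, ratio=20, steps=200):
    """Alternate schedule feedback and response function until the
    response rate stops changing; return the final rate."""
    rate = initial_rate
    for _ in range(steps):
        rate = response_function(feedback(rate, ratio))
    return rate


if __name__ == "__main__":
    for start in (2.0, 10.0):
        print(f"initial rate {start:5.1f}/min -> settles at {settle(start):6.2f}/min")
```

Under these assumptions an animal that begins at 2 responses per minute drifts to zero responding, whereas one that begins at 10 per minute settles at a high rate. Both end states satisfy the same pair of relations; which one is reached depends only on the starting point, which is the sense in which the history of the equilibrium must be known.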

We have already seen that even when no instru- 
mental response is required, behavior can usually be 
subdivided into terminal and interim classes, with the 
(induced) terminal response apparently following the 
same set of rules as if it were an instrumental re- 
sponse. As in other cases discussed in this chapter, 
most of the evidence comes from the study of a few 
behaviors in a few species. Pecking in pigeons seems to 
follow the density of food delivery in the same way on 
non-contingent as on response-contingent schedules, at 
least under the restricted conditions studied by Stad- 
don and Simmelhag (1971). However, under other 
conditions, particularly when (free) food delivery is
relatively infrequent (less than one per 2 min or so), 
this relation breaks down. Instead of occurring in 
anticipation of food delivery, pecking may occur in a 
burst in the middle of each interval (Shull, 1970; 
Staddon & Frank, 1975a). Unfortunately the observa- 
tional work necessary to decide whether or not there is 
another response, other than pecking, that anticipates 
food in these cases (i.e., a terminal response) has not 
been done, although it is clear that pecking is not in- 
duced on long fixed-time schedules as it is on short 
(Simmelhag, unpublished observations). 

Interim activities occur at times when the terminal 
response is not occurring. There seem to be two con-
ditions under which a terminal response fails to occur
(or occurs at a reduced rate): (a) When relative rein- 
forcement rate is low, but there is the opportunity for 
reinforcement at any time (e.g., variable-interval 
schedules); (b) when there is no reinforcement oppor- 
tunity (overall reinforcement rate may be high or low; 
the period immediately following food on fixed-inter- 
val schedules, and following a response on spaced-re- 
sponding schedules, are examples). In case (b), the 
terminal response does not occur at all during SΔ
periods, which are the occasion for observable interim 
activities. However, in case (a), if interim periods can 
be said to exist at all they must be brief and inter- 
spersed between occurrences of the terminal response: 
variations in rate then correspond to variations in the 
percentages of time taken up by the terminal and 
interim periods (see footnote 4, p. 144, and Rachlin,
1973). Because interim periods must be short under
these conditions, there is no opportunity for full- 
blown interim activities, and thus no easy verification 
of the hypothesis. 

With the exception of a number of experiments on 
quantitative aspects such as duration (Schwartz & Wil- 
liams, 1972) and force (Chung, 1965; Cole, 1965), 
study of the topographic details of pecking on various 
schedules has not been extensive. In informal observa- 
tions I have noticed that on variable-interval sched- 
ules pigeons generally turn away from the key between 
pecks, whereas pecking on ratio schedules, or during 
the terminal “run” on fixed-interval, is much more 
single-minded, with little turning away. It does not 
seem far-fetched to interpret this turning away (and 
thus the gap between pecks associated with it) as an 
interim period, with properties similar to interim 
periods on periodic schedules. 


Timing of Induced Sequences 


The reliable association between induced interim
activities and periodicity of the terminal, generally
instrumental response, has led to a number of at-
tempts to find a causal link between the two (e.g.,
Glazer & Singh, 1971; Hodos, Ross, & Brady, 1962; 
Nevin & Berryman, 1963; see Harzem, 1969, and 
Kramer & Rilling, 1970, for reviews). Is the periodicity 
of the terminal response on, say, a fixed-interval sched- 
ule caused in some way by the regular sequence of 
interim activities that typically precedes it? This ques-
tion cannot be answered until the mechanism by
which interim activities might serve to “time” the
terminal response is made more explicit. Suggestions 
in previous published work are of two general kinds: 
chaining explanations, and ‘behavioral clock’ ex- 
planations. 


CHAINING 


Chaining is the simplest possibility. The model is 
the “domino theory,” according to which each be-
havior in the temporal sequence A-B-C-D- etc. directly
produces the next: the offset of activity A produces 
the onset of B, the offset of B the onset of C, and so 
on. In the usual form of this explanation, each indi- 
vidual activity is assumed to take a characteristic time, 
i.e., the distribution of bout durations will show a
mode at a “preferred” duration (see McGill, 1963, for
a review of stochastic processes and “temporal dis-
crimination”). This is not necessary, however. If the
number of links in the chain is fixed, the time from
the beginning of the chain to the onset of a given
later member will not be random (i.e., exponentially
distributed) even if each link has a random duration. 
This follows directly from the central limit theorem, 
since the time of onset of chain link M is equal to the 
sum of the durations of links 1 through M-1. The 
more intervening links, the more sharply peaked will 
be the distribution of times of onset of an activity late 
in the chain. Thus, whether or not each activity is 
intrinsically timed, the chaining mechanism can 
nevertheless result in “temporal discrimination,” de- 
fined as a peaked, nonrandom distribution of starting 
times, for an activity late in the chain. 
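
The central-limit argument can be checked with a short simulation. The sketch below assumes that every link's duration is drawn independently from the same exponential distribution (the "random" case in the text). The mean duration of 5 s, the number of simulated chains, and the function names are illustrative choices, not values from any experiment.

```python
import random
import statistics


def onset_times(link_index, mean_duration=5.0, n_chains=20000, seed=0):
    """Onset times of link `link_index` (1-based) across many simulated chains.

    Each earlier link lasts an independent, exponentially distributed time,
    so the onset of link M is the sum of the durations of links 1..M-1.
    """
    rng = random.Random(seed)
    rate = 1.0 / mean_duration
    return [
        sum(rng.expovariate(rate) for _ in range(link_index - 1))
        for _ in range(n_chains)
    ]


def coefficient_of_variation(samples):
    """Standard deviation divided by mean: a scale-free measure of spread."""
    return statistics.pstdev(samples) / statistics.mean(samples)


if __name__ == "__main__":
    for m in (2, 5, 17):
        cv = coefficient_of_variation(onset_times(m))
        print(f"link {m:2d}: relative spread of onset time = {cv:.2f}")
```

For a sum of M-1 independent exponential durations the coefficient of variation is 1/sqrt(M-1), so the relative spread of onset times should fall from about 1.0 for the second link to about 0.5 for the fifth and 0.25 for the seventeenth, even though no single link is timed.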

Unfortunately, chaining explanations do not fit the 
facts for behavior sequences induced by periodic 
schedules. There are three kinds of evidence that pose 
problems: 

(a) Interim activities such as drinking are often 
repetitive. Thus the kind of chain actually observed is 
closer to A-A-A-...-O (a homogeneous chain)
than to the A-B-C-...-O (heterogeneous chain) of
the model. What determines run length in the homo- 
geneous chain? No merely probabilistic process will 
suffice to make the length of the chain other than 
random (i.e., independent of time); some kind of 
counting mechanism is required. Yet there is no 
evidence that animals can count better than they can 
time, so that an explanation of timing in terms of 
counting is unsupported. 

(b) In a simple chain, the necessary and sufficient 
cause of activity N is the occurrence of activity N-1.
Hence prevention of activity N-1 should eliminate
activity N. This is not what happens in temporal be-
havior sequences. For example, elimination of the
water bottle normally present during a fixed-time food
schedule usually causes rats to make the terminal re-
sponse earlier in the interval (Staddon & Ayres, 1975).
Many experiments, using both spaced-responding and 
periodic schedules, have shown that prevention of 
interim activities disrupts temporal discrimination by 
causing the instrumental response to occur too soon 
(e.g., Frank & Staddon, 1974; Glazer & Singh, 1971; 
Laties, Weiss, Clark, & Reynolds, 1965; Laties, Weiss, 
& Weiss, 1969).³ Since the terminal response is usually
preceded by interim activities such as drinking, and 
elimination of these activities if anything facilitates 
the terminal response, they cannot be links in a chain 
that ends with that response. 


³ This discussion considers spaced-responding (DRL) schedules
on the same basis as interval schedules. In the spaced-responding 
case, the terminal response is timed from the previous terminal 
response, whereas in interval schedules it is timed from rein- 
forcement, but performance on both schedules seems to be 
similarly affected by (for example) prevention of interim 
activities. 




It is perhaps worth noting that most attempts to 
explain wholly endogenous behavior sequences by 
means of behavior or reflex chains have been unsuc- 
cessful. For example, insect flight and walking pat- 
terns, once thought to depend on chain reflexes, have 
been shown to involve central programming (e.g., 
Wilson, 1961, 1966). However, chain accounts have 
been quite successful in explaining behavior sequences 
incorporating extrinsic stimuli, such as the “lock and 
key” courtship sequences described by Tinbergen 
(1951), and the hunting behavior of the wasp Philan- 
thus triangulum and many other invertebrate preda- 
tors (see Hinde, 1970, for a review). In the context of 
operant behavior, the chaining concept arose in con- 
nection with chained schedules. These parallel the 
ethological examples just mentioned, in that responses 
produce external stimuli that in turn produce other 
responses. It seems prudent to reserve chaining ac- 
counts specifically for situations in which the succes- 
sive stimuli are provided by the environment, with 
only the response elements of the chain being con- 
tributed by the animal. 

(c) Induced sequences show both variability of 
succession (A is not always followed by B) and 
temporal variability (B does not always occur at, or 
for, the same time). Both these features are incom- 
patible with simple chaining, but might be modeled 
by a Markov process (Cane, 1959, 1961; Staddon, 
1972a). However, analyses of behavior sequences in 
both pigeons (Staddon, 1972a) and rats (Staddon & 
Ayres, 1975) show that even a Markov account is not 
adequate, at least in a simple form. The essential 
property of a Markov process is that each state (activ- 
ity) is dependent only on the preceding one. There- 
fore there should be no dependence of the onset (or 
offset) of an activity on time, other than the time 
elapsed since the preceding activity (or since the be- 
ginning of the activity). Yet on fixed-time schedules 
both pigeons and rats show such dependencies. The 
time between two successive activities tends to be 
shorter the later the first activity ends within the inter- 
val; and the duration of a given activity tends to be 
shorter the later it begins. 
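
The prediction that fails can be made concrete with a small simulation. In the sketch below the next activity depends only on the current one, and each bout's duration depends only on which activity it is. The three activities, the transition probabilities, and the mean durations are all hypothetical values chosen for illustration, and the end of the interfood interval is deliberately not modeled.

```python
import random

# Hypothetical transition probabilities: outer key = current activity.
TRANSITIONS = {
    "drink": {"run": 0.7, "groom": 0.3},
    "run": {"drink": 0.4, "groom": 0.6},
    "groom": {"drink": 0.5, "run": 0.5},
}
# Hypothetical mean bout durations in seconds.
MEAN_DURATION = {"drink": 8.0, "run": 5.0, "groom": 3.0}


def simulate_interval(rng, n_bouts=6):
    """Return (activity, onset_time, duration) for each bout after food."""
    state, clock, bouts = "drink", 0.0, []
    for _ in range(n_bouts):
        duration = rng.expovariate(1.0 / MEAN_DURATION[state])
        bouts.append((state, clock, duration))
        clock += duration
        options = TRANSITIONS[state]
        state = rng.choices(list(options), weights=list(options.values()))[0]
    return bouts


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def onset_duration_correlation(activity, n_intervals=20000, seed=1):
    """Correlation between when a bout of `activity` begins and how long it lasts."""
    rng = random.Random(seed)
    onsets, durations = [], []
    for _ in range(n_intervals):
        for name, onset, duration in simulate_interval(rng):
            if name == activity:
                onsets.append(onset)
                durations.append(duration)
    return pearson(onsets, durations)


if __name__ == "__main__":
    print(f"onset-duration correlation for running: {onset_duration_correlation('run'):+.3f}")
```

In a process of this kind the correlation between the time a bout begins and its duration is zero apart from sampling error. The observed tendency for bouts to be shorter the later they begin is therefore a departure from the simple Markov account, not something it can produce.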

Davey, Harzem, and Lowe (personal communica- 
tion) report that “running” rate on fixed-interval 
schedules (i.e., rate of lever pressing following the first 
press in each interval) is directly related to pause 
(time to the first lever press): the later the rats begin 
to press, the faster they go. Pigeons showed no effect 
of pause on overall running rate. In subsequent ex- 
periments Staddon and Frank (1975b) have found 
that the rate at which many pigeons accelerate to their 
fixed terminal rate depends on pause: the longer the
pause, the more rapidly they accelerate. The low rate
at which they start, and the high rate that they finish 
up with, is more or less constant, but the time they 
take to get from one to the other decreases as the time 
available for responding before food delivery de- 
creases. 

All these observations underline the dependence of 
induced activities on the time elapsed since the begin- 
ning of the fixed interval. Chaining cannot easily ac- 
count for this dependency. 


BEHAVIORAL CLOCKS 


“Behavioral clock” interpretations are less explicit
than the chaining account. They are based on the fre-
quent observation that prevention of interim (“col-
lateral”) activities disrupts “temporal discrimination”
(i.e., pausing) on fixed-interval and spaced-responding
(DRL) schedules. The idea that the animal “uses” the
interim activities to suppress the terminal response for
a time is little more than a restatement of this obser-
vation. One refinement is to attribute to the collateral
activities an intrinsic periodicity, so that they serve the
function of a behavioral clock. In this form the
hypothesis resembles chaining. The difference is that
the stimuli (“causal factors”) for the terminal re-
sponse are assumed to be present all the time; the
response fails to occur early in the interval only be- 
cause it is suppressed by the “collateral” behaviors that 
constitute the clock. When they have run their course, 
the terminal response occurs. The mechanism in this
case is a type of disinhibition, whereas in chaining the
terminal response is directly produced (elicited, con-
trolled) by the penultimate behavior in the chain.
However, the arguments against chaining apply also
to this form of behavioral clock. In particular, the
negative correlation between the offset of the last in-
terim activity and the onset of the terminal response
shows that the terminal response is directly affected 
by post-food time. Although evidence from prevention 
experiments shows that interim activities do exert 
some suppressive effect on the terminal response (since 
their elimination causes the terminal response to 
occur earlier), the correlation data show that this dis- 
inhibiting effect is not the sole determiner of the 
temporal locus of the terminal response; some kind of 
“internal clock” is also involved. 

Although inhibition due to interim activities is not
the only factor affecting the timing of the terminal 
response, it may be a factor. A further refinement of 
the behavioral clock view is to suppose that some 
measure of temporal discrimination is a function of 
some property of the interim activities—vigor, rate,
etc. This relation can be looked at either within or
across individuals. Within individuals it is plausible: 
suppression of interim activities disrupts a developed 
temporal discrimination (Frank & Staddon, 1974; 
Schwartz & Williams, 1971), and interim activities 
such as drinking tend to develop in step with a ter- 
minal response such as food anticipation during train- 
ing (e.g., Pouthas & Cave, 1972; Staddon & Ayres,
1975). There is less evidence for a correlation across
individuals. For example, Smith and Clark (1974) 
found no correlation between rates of running or in-
duced licking and efficiency of performance on spaced-
responding schedules. However, Glazer and Singh (1971)
found that temporal discrimination was inversely re- 
lated to degree of restraint in three groups of rats that 
were either unrestrained, partially restrained, or 
severely restrained. In informal observations we have 
noticed that pigeons trained in small Skinner boxes 
sometimes fail to show the typical fixed-interval “scal-
lop” and respond more or less continuously; animals
trained in the usual large boxes rarely show this 
pattern. These various experiments cannot be rigor- 
ously compared, because of species differences and be- 
cause amount of training obviously interacts with 
these differences. Nevertheless, taken together there is 
much evidence that temporal discrimination is fav-
ored by an environment that affords animals oppor-
tunities for interim activities.

Although there is evidence for some relation be- 
tween temporal discrimination and interim activities, 
no particular interim activity is necessary for appro-
priate timing. While many authors report vigorous
“collateral” behaviors on temporal schedules (e.g.,
Hendry & Dillow, 1966; Laties et al., 1965; Zuriff,
1969), others report none (e.g., Anger, 1956; Kelleher,
Fry, & Cook, 1959; Reynolds & Catania, 1962). And
although pigeons trained under unrestrained condi-
tions show the expected disruption when shifted to
conditions of bodily restraint, the disruption is tran-
sient and after protracted training there is little steady-
state difference (Frank & Staddon, 1974). Presumably
the transient disruption occurs because interim activ-
ities possible under free conditions are prevented
when the pigeon is restrained. Evidently other interim 
activities soon develop, however, since behavior re- 
covers to almost the same level as before the shift. 
Frank and Staddon also found a disruption when 
birds trained under restrained conditions were shifted 
to free conditions. This disruption is harder to ex-
plain in terms of prevention of previously available
interim activities. However, it can be understood in 
terms of a wider scheme for classifying sequential 
interactions, to which I now turn. 
Types of Sequential Interaction


It has not yet proved possible to explain in detail 
the mechanisms underlying induced, or indeed any 
other, behavior sequences. Short of such a complete 
explanation, terms such as inhibition, disinhibition, 
elicitation, and causal factors have become current in 
the animal behavior literature as a way of classifying 
types of sequential interaction. These terms can be
employed in several ways. The present section devel- 
ops an approach that is consistent with the facts 
already discussed and suggests questions that can be 
answered empirically. This approach is closely related
to the more formal state-space approach recently elab- 
orated by McFarland (McFarland, 1974; McFarland & 
Sibly, 1975; Sibly & McFarland, 1974) and to the 
theoretical system of Atkinson & Birch (1970). 


DEFINITIONS 


Behavioral State. Evidence already discussed shows 
that the overt activity that is actually observed (per-
formance) is only one aspect of an underlying be-
havioral state. Terms that convey aspects of the term
state, in the sense used here, are “mood,” “expectation,”
“motivational state,” and even “operant,” in the sense 
that an operant is a class of behaviors with common 
controlling factors. Bindra’s (1969) “central motive 
state” is also close to the present meaning. States are 
mutually incompatible; they are the basic interacting 
elements in this scheme. 


Activity. This is an observed class of motor pat- 
terns; it has both stimulus and response components. 
These motor patterns (e.g., pecking, drinking, pacing 
in a particular place, etc.) are necessarily defined sub- 
jectively, but little practical difficulty is usually en- 
countered in settling on reliable categories. A state 
exists independently of any particular activity, but the 
performance of an activity may act back on the 
strength of a state (i.e., on the level of its causal fac-
tors), either increasing it (“momentum” effects; posi-
tive feedback) or decreasing it (self-inhibition; nega-
tive feedback). Providing the environment is constant,
activities are assumed to be generally reliable (one-to- 
one) indicators of their associated states, and the 
terms activity and state are treated as equivalent in 
the following discussion. 


Causal Factors. These are environmental factors
affecting the strength of states. Causal factors are as- 
sumed always to be facilitatory, so that suppression of a 
given activity by a stimulus is assumed to be due to a 
decrease in its causal factors and/or an increase in the 
causal factors for an incompatible activity. The idea 
that inhibition is due to the activation of an incom-
patible activity follows directly from the hypothesis
that interim and terminal states are incompatible
and has gained some currency in studies of condi-
tioned inhibition in Pavlovian situations (e.g., Anok-
hin, 1974; Konorski, 1967). The “strength” of a state
(activity) is equivalent to the strength of its causal 
factors. Examples of causal factors are discriminative 
and eliciting stimuli, time, and antecedent activities 
(as in chaining). 


COMPETITION ASSUMPTION 


For simplicity, I assume that the animal can be in 
only one state at a time. Thus, states compete for ac- 
cess to what might be termed the behavioral final 
common path.4 In the following discussion it is as-
sumed that this competition is all at one level, every
state (activity) competing directly with all the others 
that are possible in the situation. However, the 
scheme can easily be generalized to allow for hierar- 
chical or other multilevel interactions, with a given 
state competing directly only with states at its own 
level. 
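
The competition assumption can be illustrated with a minimal sketch. The function name `dominant_state`, the numerical strengths, and the rule that a state with zero strength is never expressed are illustrative assumptions and are not part of the scheme as stated above.

```python
from typing import Dict, Optional


def dominant_state(strengths: Dict[str, float]) -> Optional[str]:
    """Return the one state that gains the behavioral final common path.

    `strengths` maps each state (activity) to the current strength of its
    causal factors. Competition is at a single level: the strongest state
    is expressed and all others are excluded. A state whose strength is
    zero or less is assumed never to be expressed.
    """
    candidates = {state: s for state, s in strengths.items() if s > 0}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical strengths early and late in an interfood interval.
early = {"drinking": 0.8, "running": 0.4, "food anticipation": 0.1}
late = {"drinking": 0.1, "running": 0.4, "food anticipation": 0.9}

print(dominant_state(early))
print(dominant_state(late))
```

A hierarchical version would apply the same rule separately within each level, which is the generalization mentioned in the text.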

This scheme suggests a taxonomy of simple be- 
havioral interactions. I first develop such a taxonomy, 
and then apply it to some of the data and concepts 
discussed earlier in the chapter. 


SIMPLE SEQUENTIAL INTERACTIONS 


This view allows for two kinds of simple inter- 
action between successive behaviors. It is assumed 
that the shift from one behavior to another is owing 
to a change in only one causal factor, which either in-
creases (inhibition, elicitation) or decreases (disinhibi-
tion, subduction) with time. These two types of inter-
action define four terms, as follows:


Inhibition. This occurs when activity A ceases to
occur because of an increase in the causal factors 


4 Quantitative variations in response rate can be handled 
within this scheme by assuming that repetitive activities occur at 
a more or less fixed, maximum rate, so long as the animal is in 
the appropriate state, and that variations in rate occur because 
of switching between states. There is evidence for this kind of 
fixity in the case of drinking: licks occur at a more or less fixed 
rate within each bout (e.g., Marowitz & Halpern, 1973). Varia- 
tions in overall lick rate are therefore associated with a propor- 
tional increase in the percentage of time spent in the licking 
state. There is some evidence that even “operant” behaviors 
such as pecking are similarly constrained, although perhaps not 
to the same degree (Blough, 1963; Gilbert, 1958). The notion is 
hard to test unless typical bout lengths are considerably longer 
than the modal interbehavior interval. The problem of defining
the length of an activity bout, discussed by ethologists (e.g., Isaac 
& Marler, 1963; Nelson, 1973), confronts essentially the same 
issue. 
Fig. 6. Simple dyadic interactions.


(CFs) for some other activity, B (the next-in-priority
activity). When CF_B > CF_A, A is displaced by B. In-
hibition is illustrated in Figure 6A. The top line
shows the observed sequence of behaviors: A followed 
by B. The curves below show the changes in stimulus 
factors (CFs) hypothesized to underlie this change. 
They show that the level of CFs for A remains con-
stant, but that A is supplanted by B when the CFs for
B increase beyond the level of those for A. 


Disinhibition. This is simply the reverse of inhi-
bition: activity B occurs (and activity A ceases to occur)
because the CFs for antagonistic activity A decrease 
below the level of those for B. This is illustrated in 
Figure 6B. 


Elicitation. The causation of behavior B in Figure
6A illustrates elicitation: B occurs because its CFs in-
crease, all other CFs remaining constant.


Subduction. This is a neologism to describe the
opposite of elicitation: a behavior ceases because its 
CFs decrease in strength, all other CFs remaining con- 
stant. It is illustrated by the offset of behavior A in 
Figure 6B. 

It is clear that the two kinds of interaction illus- 
trated in Figure 6 are simply extreme cases on a con- 
tinuum of dyadic interactions between successive 
behaviors. For two such behaviors, A and B, the CFs 
for A can either decrease, increase, or remain constant 
with time, and similarly for B. If A is occurring 
initially, a shift to B will occur only if the maintained 
rate of change in CF, is greater than that for CF, 


dt ~ dt 
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(and this differential persists for a sufficient time): 


dA dB 


dA d 
> aan where—— =an (CF,). It is assumed that 


the CF functions are continuous. 
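
The two extreme cases can be sketched numerically. In the sketch below the linear CF functions, their numerical values, and the name `switch_time` are hypothetical; only the rule that B displaces A once the CFs for B exceed those for A is taken from the text.

```python
from typing import Callable, Optional

CausalFactor = Callable[[float], float]  # maps time (sec) to CF strength


def switch_time(cf_a: CausalFactor, cf_b: CausalFactor,
                t_end: float, dt: float = 0.01) -> Optional[float]:
    """Return the first sampled time at which CF_B exceeds CF_A.

    Activity A is assumed to be occurring at t = 0. None means that
    A persists for the whole period.
    """
    steps = int(round(t_end / dt))
    for i in range(steps + 1):
        t = i * dt
        if cf_b(t) > cf_a(t):
            return t
    return None


# Inhibition of A / elicitation of B (the Figure 6A pattern):
# CF_A is constant and CF_B rises. All numbers are hypothetical.
def constant_a(t: float) -> float:
    return 1.0


def rising_b(t: float) -> float:
    return 0.1 * t


# Disinhibition of B / subduction of A (the Figure 6B pattern):
# CF_B is constant and CF_A falls.
def falling_a(t: float) -> float:
    return 2.0 - 0.1 * t


def constant_b(t: float) -> float:
    return 1.0


print(switch_time(constant_a, rising_b, t_end=30.0))   # close to 10 sec
print(switch_time(falling_a, constant_b, t_end=30.0))  # close to 10 sec
```

With these values both cases yield the same observed switch from A to B, which is why the observed sequence alone cannot identify the underlying interaction.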


Experimental Analysis. In the examples illus- 
trated in Figure 6 all that is actually observed is a 
switch from one activity to another. The diagrams 
showing changes in CF strength with time constitute 
hypotheses about the underlying causation and must 
be tested by manipulating the putative causal factors. 
In each example, only one CF changes with time; if it 
is under direct experimental control, verification of 
the causal hypothesis is a trivial matter. However, in 
the cases discussed in this chapter, the changing CF 
is usually an inferred “internal clock” that is only 
under indirect experimental control. For example,
suppose (for the sake of illustration) that behavior A 
is an interim activity such as drinking, and B a ter- 
minal response such as lever pressing, with the switch 
from A to B timed from food delivery. Then the 
changing CF (CF_B in Figure 6A, CF_A in Figure 6B)
must also be assumed to be timed from food delivery, 
so that the absolute time of the origin of the curves 
in Figure 6 can be controlled, but not their form. 

How might the two cases in Figure 6 be distin-
guished experimentally? The answer to this question
depends critically on quantitative issues: the absolute
values of the strengths of the CFs for the two activ-
ities. With the values shown in Figure 6 it is apparent
that complete elimination of activity A (e.g., by re-
moving a constant CF not shown in the diagrams: the
water bottle) will cause activity B to occur earlier in 
the interval under case B than under case A. Similarly, 
removal of a constant CF for activity B (the lever) 
should cause activity A to occur throughout the in- 
terval, under case A, but prolong A only slightly 
under case B. In this simple case, therefore, it is rela- 
tively easy to distinguish between these two hypoth- 
eses. However, if the CFs for both activities change 
with time, or if neither CF ever decreases to zero, dis- 
crimination is much more difficult. One possible tech- 
nique in that case is to introduce a third competing 
activity, such as running, whose CFs can be assumed 
to be constant during the test period. By manipulat- 
ing its strength, an estimate of the relative strengths of 
A and B, as a function of time, might be arrived at. 
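
The logic of these removal tests can be sketched under stated assumptions. The CF functions below are hypothetical stand-ins for the curves of Figure 6 (in particular, the 5-sec delay before the CFs for B begin to rise in case A is assumed), and the names `expressed` and `times_expressed` are illustrative.

```python
from typing import Callable, Dict, FrozenSet, Optional

CFSet = Dict[str, Callable[[float], float]]


def expressed(t: float, cfs: CFSet,
              removed: FrozenSet[str] = frozenset()) -> Optional[str]:
    """Activity with the strongest positive CFs at time t, if any."""
    strengths = {name: f(t) for name, f in cfs.items() if name not in removed}
    positive = {name: s for name, s in strengths.items() if s > 0}
    return max(positive, key=positive.get) if positive else None


def times_expressed(activity: str, cfs: CFSet, t_end: float = 30.0,
                    dt: float = 0.1,
                    removed: FrozenSet[str] = frozenset()) -> list:
    """Sampled times (sec after food) at which `activity` is expressed."""
    steps = int(round(t_end / dt))
    return [i * dt for i in range(steps + 1)
            if expressed(i * dt, cfs, removed) == activity]


# Case A (inhibition): CF_A constant, CF_B rises after a 5-sec delay.
case_a: CFSet = {"A": lambda t: 1.0,
                 "B": lambda t: max(0.0, 0.1 * (t - 5.0))}
# Case B (disinhibition): CF_A falls to zero, CF_B constant.
case_b: CFSet = {"A": lambda t: max(0.0, 2.5 - 0.1 * t),
                 "B": lambda t: 1.0}

no_a = frozenset({"A"})  # e.g., the water bottle is removed
no_b = frozenset({"B"})  # e.g., the lever is removed

for label, case in (("case A", case_a), ("case B", case_b)):
    b_onset = times_expressed("B", case)[0]
    b_onset_without_a = times_expressed("B", case, removed=no_a)[0]
    a_end_without_b = times_expressed("A", case, removed=no_b)[-1]
    print(label, b_onset, b_onset_without_a, a_end_without_b)
```

In the unmanipulated condition the two cases switch at about the same time; they separate only when A or B is removed, as the argument requires.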


Internal Feedback. The diagrams in Figure 6 im-
ply that the CFs for a given activity do not depend on
whether or not that activity is actually occurring. Yet 
this kind of independence is unlikely to be general. 
Once begun, an activity may have a certain momen-
tum and resist competition from other activities that
might have been sufficient to prevent its initial occur- 
rence. Conversely, even in the absence of competition, 
most activities cease to occur after a time, presumably 
because of some kind of self-inhibition (Hull’s, 1943, 
“reactive inhibition”; the “consummatory force” of
Atkinson & Birch, 1970). The circumstantial evidence
for these kinds of internal feedback interaction is 
strong. Unfortunately, they are hard to measure 
directly just because the loops are internal ones.

For example, self-inhibition cannot immediately be 
distinguished from the hypothesis of an internal 
clock. When first exposed to a running wheel, a rat 
may tend to run for a more or less fixed time; is this 
because of a fixed internal clock that times the run 
bout, or because of negative feedback from the re-
sponse? The second possibility can be evaluated by
surgical intervention (e.g., deafferentation), or by vary- 
ing the resistance of the running wheel. If these 
operations change the duration of running, some role 
for feedback is demonstrated. In any particular case, 
tests of this sort can usually be devised. However, the 
possibility of self-feedback greatly complicates the 
task of experimental analysis. 
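
The running-wheel test can be sketched as a contrast between the two hypotheses. The particular form of the feedback (a decline in the CFs for running proportional to wheel resistance) and all parameter values are assumptions made for illustration only.

```python
def bout_duration_clock(clock_time: float, resistance: float) -> float:
    """Internal-clock hypothesis: the bout lasts a fixed time,
    whatever the resistance of the wheel."""
    return clock_time


def bout_duration_feedback(initial_cf: float, competitor_cf: float,
                           gain: float, resistance: float,
                           dt: float = 0.01, t_max: float = 600.0) -> float:
    """Self-inhibition hypothesis: each moment of running lowers the CFs
    for running by gain * resistance (assumed form), and the bout ends
    when they fall to the level of a constant competitor."""
    cf, t = initial_cf, 0.0
    while cf > competitor_cf and t < t_max:
        cf -= gain * resistance * dt
        t += dt
    return t


for resistance in (1.0, 2.0):
    print(resistance,
          bout_duration_clock(10.0, resistance),
          bout_duration_feedback(1.0, 0.5, gain=0.05, resistance=resistance))
```

Only the feedback version predicts a change in bout duration when resistance is varied, which is the outcome that would demonstrate some role for feedback.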

This general problem is a familiar one to students 
of the major homeostatic systems such as food and 
water regulation (see Satinoff & Henderson, Chapter 6).
However, much more is known about the internal 
feedbacks that affect the duration of an eating bout, 
for example, than about the comparable factors affect-
ing running, preening or lever pressing. The fact that
the mechanisms underlying such responses are unlikely 
to be fixed, but may depend on the situation in which 
the response occurs (e.g., whether the response is 
schedule-induced, or occurs under free conditions), is 
a further complication. 


Variability. Although schedule-induced behavior 
is characteristically highly stereotyped, there is never-
theless variation in its form and, particularly, its
temporal pattern from one interfood interval to the 
next. Temporal variability (variation in the temporal 
location of an activity as a function of post-food time) 
is always found, but variability of succession (variation 
in the order of activities) is less common. Thus, the 
function relating the strength of causal factors to 
post-food time must be assumed to vary from interval 
to interval. The simplest form of variation is a 
stochastic process with zero mean superimposed on the 
average CF functions (e.g., the curves in Figure 6). 
Since it is assumed that the CF functions change at a 
finite rate (i.e., there are no step-functions other than
those due to the onset of extrinsic stimuli), variation 
in relative levels will cause variation in the time of 
switching from one behavior to another and, if the 
added random variation is large, may even cause re- 
versals of order. 
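
A minimal simulation shows how zero-mean variation in CF levels produces variation in switching time. The sketch assumes the simplest possible stochastic process, a Gaussian offset that is fixed within each interfood interval, together with hypothetical linear CF functions; it addresses temporal variability only, not reversals of order.

```python
import random
import statistics


def noisy_switch_time(level_a: float, slope_b: float, noise_sd: float,
                      rng: random.Random) -> float:
    """Switch time from A to B in one interfood interval.

    Average functions (hypothetical): CF_A = level_a (constant) and
    CF_B = slope_b * t. A zero-mean Gaussian offset, fixed within the
    interval, is added to each function. B displaces A when the two
    noisy functions cross.
    """
    offset_a = rng.gauss(0.0, noise_sd)
    offset_b = rng.gauss(0.0, noise_sd)
    return max(0.0, (level_a + offset_a - offset_b) / slope_b)


rng = random.Random(0)  # seeded so that the run is repeatable
times = [noisy_switch_time(1.0, 0.1, 0.05, rng) for _ in range(1000)]
print(statistics.mean(times), statistics.stdev(times))
```

Because the CF for B changes at a finite rate, a small offset in level translates into a proportionally larger shift in switching time: the shallower the slope, the greater the temporal variability.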


APPLICATIONS 


Variability, internal feedback, and the possibility of 
competition at several levels, can obviously combine 
to produce sequences of behavior that defy analysis by 
means of simple experimental tests. Bearing these 
complex possibilities in mind, it may nevertheless be 
useful to see to what extent relatively simple interac- 
tions, such as those illustrated in Figure 6, can ex- 
plain the experimental results discussed earlier. Three 
cases will be considered: interactions between running 
and drinking, the effect of interfood interval, and an 
anomalous experimental result owing to the persis- 
tence of running. 


Running/Drinking Interactions. Figure 7A shows
the postulated time course of CFs for eating (E), 
drinking (D), running (R), and food anticipation 
(FA), underlying the maintained temporal distribu- 
tion of these activities shown in Figure 1B, which 
shows data from a rat on a fixed-time 30 sec schedule. 
The vertical lines marked “P” in Figure 7 show the 
times of pellet delivery. The CFs for running are as- 
sumed to be essentially constant, so that running con- 
stitutes a disinhibited activity which occurs only when 
the strengths of drinking and food anticipation are 
low. One prediction from this hypothesis is that the 
elimination of drinking (e.g., by removing the water 
bottle) should cause running to occur earlier in the in- 
terval, but should have little effect on the time of onset 
of food anticipation. Elimination of running (by re- 
moving the running wheel) should cause food anticipa- 
tion to begin earlier and drinking to persist later in the 
interval. Elimination of drinking and running should 
Fig. 7. Hypothesis for interactions among behaviors induced
by periodic food. A: Low to intermediate food rate. B: High 
food rate. 
cause food anticipation to begin sooner than elimina-
tion of running alone. These predictions assume that
the pattern of CFs illustrated in Figure 7 remains 
more or less constant during the test period, i.e., they 
are predictions about transfer effects. 

In general these predictions are borne out. Elimina-
tion of running does appear to cause drinking to per-
sist longer and food anticipation to begin earlier in
the interval, and elimination of drinking does seem to 
have more effect on running than on food anticipa- 
tion (Segal, 1969; Staddon & Ayres, 1975). However, 
the Staddon and Ayres data come from steady-state 
adjustments to the manipulations rather than transfer 
(first-day) measures, the Segal data do not show 
temporal location of activities, and exact data do not 
seem to be available elsewhere. Hence these predic-
tions cannot yet be precisely evaluated.


Effects of Interfood Interval. Figure 7B shows the 
postulated interactions on a short (5-15 sec) fixed-time 
food schedule. The CFs for running are the same as in 
Figure 7A, but the CFs for food anticipation rise to 
their asymptote more quickly, because of the shorter 
interval. Consequently, the crossover of the curves for
drinking and food anticipation occurs at a level
higher than the level of the CFs for running, which
cannot therefore occur, apart from the effects of ran-
dom variation. Thus the decrease in the frequency of
running with interfood interval shown in Figure $ is 
explained by assuming that the CFs for drinking and
food anticipation reach the same asymptote in short 
intervals as in long ones, and therefore cross over at a 
level that is inversely related to interval length: the 
shorter the interval, the higher the crossover point. 
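
The inverse relation between interval length and crossover level can be checked with a small numerical sketch. The functional forms are assumptions: the CFs for drinking are taken to decay exponentially with a fixed time constant, and the CFs for food anticipation to rise linearly to the same asymptote at the end of the interval. The value of `RUNNING_CF` is likewise hypothetical.

```python
import math


def crossover_level(interval: float, asymptote: float = 1.0,
                    tau_drink: float = 5.0, dt: float = 0.001) -> float:
    """CF level at which food anticipation overtakes drinking.

    Assumed forms: the CFs for drinking decay exponentially from the
    asymptote with a fixed time constant, and the CFs for food
    anticipation rise linearly to the same asymptote at the end of
    the interfood interval.
    """
    steps = int(round(interval / dt))
    for i in range(steps + 1):
        t = i * dt
        drinking = asymptote * math.exp(-t / tau_drink)
        anticipation = asymptote * t / interval
        if anticipation >= drinking:
            return anticipation
    return asymptote


RUNNING_CF = 0.4  # constant CFs for running (hypothetical value)

for interval in (5.0, 15.0, 30.0):
    level = crossover_level(interval)
    # Running can occur only if its CFs exceed the crossover level.
    print(interval, round(level, 2), RUNNING_CF > level)
```

Under these assumptions running is excluded at the shortest interval and admitted at the longer ones, apart from the effects of random variation.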

Frank and Staddon (1974) found that pigeons 
trained on a periodic schedule under conditions of
bodily restraint showed disrupted temporal discrim-
ination (i.e., the terminal response, key pecking, oc-
curred earlier in the interval) when shifted to unre-
strained conditions. Figure 7B sheds some light on
this result if it is assumed that the shift had more 
effect on the CFs (controlling stimuli) for whatever 
interim activity was occurring under restrained condi- 
tions (analogous to drinking in Figure 7B) than on 
the terminal response. This assumption seems reason- 
able since the main CF for pecking was the response 
key, which was not affected by the shift. Figure 7B 
makes it clear that any reduction in the CFs for the 
interim activity, relative to those for the terminal re- 
sponse, will cause the terminal response to occur 
earlier in the interval, as Frank and Staddon found. 


Persistence of Running. Figure 8 shows the frac- 
Fig. 8. An unusual pattern of behavior induced in a female
rat by a 30-sec periodic food schedule.


tion of time spent in various activities (activity areas) 
by a female rat in a hexagonal apparatus in which it 
received a food pellet every 30 sec (Staddon & Ayres,
1975). The behavior of this animal was anomalous, as
can be seen by comparing Figure 8 with Figure 1B.
The rat in Figure 8 showed running as an apparent
terminal response, since the activity increased in fre-
quency up until the time of the next food delivery. A
naive reinforcement theory interpretation might con-
clude that running in this case was adventitiously
reinforced by food delivery—with which it was almost 
invariably contiguous. However, this interpretation is 
contradicted by the results of extinction tests, in 
which the first effect of food omission was an increase 
in food anticipation, followed later by an increase in 
running above the level observed under the food 
schedule. 

Figure 9 shows what may be a more accurate repre-
sentation. The curves are identified as in Figures 6
and 7. However, the abscissa is post-eating (rather 
Fig. 9. Hypothesis for interactions underlying behavior shown
in Figure 8. 


than post-pellet-delivery) time, since this animal did 
not always eat the pellet as soon as it was delivered. 
The main difference between Figure 9 and Figure 7A 
is in the postulated CF function for running, which 
shows positive acceleration once the response is ex- 
pressed (a positive feedback, Fb+, “momentum” 
effect), and declines with the continued occurrence of 
running (a negative feedback, Fb-, “self-inhibition”
effect). A consequence of the momentum effect is that 
when the food pellet is delivered, the CFs for running 
are above those for eating. The pellet is not eaten 
until the CFs for running decrease (self-inhibition) be- 
low those for food anticipation, the animal enters the 
feeder area, sees the food, and eats. 

The model predicts the initial perseveration of 
food anticipation in extinction as well as the eventual 
dominance of running. It also predicts that the 
temporal pattern of food anticipation will approach 
that shown by the other rats (e.g., the animal in 
Figure 1B) if drinking is eliminated (allowing running 
to cease earlier in the interval), and should be essen- 
tially the same as the pattern shown by the others if 
running is eliminated. The data support both these 
predictions. Other features of the model for this 
animal have not been tested in detail (for example, 
the proper tests to distinguish response-produced 
feedbacks from response-produced clocks were not 
carried out). However, the basic hypothesis that the 
anomalous behavior of this individual is largely, per- 
haps entirely, attributable to its tendency to run seems 
quite well supported. 

Once the mechanisms underlying behavioral se- 
quences are fully understood, the representations in 
Figures 6, 7, and 9 will undoubtedly appear cumber- 
some and redundant. However, these diagrams force 
one to be precise about often loosely applied terms 
such as inhibition. They also suggest clear experi- 
mental tests. Without such a framework it is often 
difficult to ask good experimental questions about
temporal sequences, or to interpret the results of
experimental manipulations. 


CONCLUDING COMMENTS 


The experimental literature on schedule-induced 
behavior, though extensive, is too unsystematic to 
point clearly to any particular theoretical integration. 
This chapter is an attempt to provide an organizing 
framework to guide both experimentation and inter- 
pretation. 

The traditional emphasis on a single, instrumental 
response is misplaced. The work reviewed here shows
that the temporal pattern of any activity depends on 
its interactions with other activities that are induced 
by the situation. These induced behaviors must be 
considered on the same footing as the instrumental 
response. They are often just as vigorous (sometimes
even more vigorous), are as reliably produced, and
share some of the same causal factors. The “laws”
of operant behavior are not a property of isolated re- 
flexes, but emergent properties of a set of interactions 
among induced states and their associated behaviors. 
Each behavior has its own controlling (causal) factors, 
both stimuli and time. Any environmental change 
affects the instrumental response both directly, and 
indirectly through its effects on other causally related 
activities. The effects of such changes cannot be fully 
understood until these interactions have been teased 
out. 
Thermoregulatory Behavior*


INTRODUCTION 


Temperature regulation is a homeostatic process
which is mainly behavioral, but for many years be- 
havior was largely ignored in its analysis. There are 
two probable reasons for this omission. First, no con- 
venient method for quantifying thermoregulatory be- 
havior was available until 1957. In that year, Carlton 
and Marx in one study and Weiss in another demon- 
strated that operant behavior was precisely attuned to 
regulating body temperature. They showed that rats 
in the cold would press a bar that turned a heat lamp 
on, thereby preventing a fall in internal temperature.1 


* The preparation of this chapter and some of the un- 
published research described in it was supported by Research 
Grant #NS 12033 from the National Institute of Neurological 
Diseases and Stroke and Grant #CRR Psychology from the Uni-
versity of Illinois Research Board to the first author. We thank
Drs. R. D. Luce, H. Rachlin, B. Schwartz, and especially W. 
Honig and J. Staddon for their helpful comments on previous 
versions of this manuscript. 


1 The effectiveness of thermal reinforcement has been demon-
strated in various species, including baboons (Gale, Mathews, &
Young, 1970), macaques (Carlisle, 1970), squirrel monkeys (Adair,
1970; Carlisle, 1966), dogs (Cabanac, Duclaux, & Gillet, 1970),
cats (Clark & Lipton, 1974; Weiss, Laties, & Weiss, 1967), pigs
(Baldwin & Ingram, 1967), rats (Epstein & Milestone, 1968;
Lipton, 1968; Matthews, 1969; Weiss & Laties, 1961), mice
(Baldwin, 1968; Revusky, 1966), Barbary doves (Budgell, 1971),
chicks (Zolman, 1968), lizards (Hammel, Caldwell, & Abrams,
1967), alligators (Davidson, 1966), and goldfish (Rozin & Mayer,
1961).


Second, it was assumed that when sufficient informa-
tion about the neural control of reflexive thermoreg-
ulatory responses was obtained, it would account for
operant behavior as well. The first purpose of this
chapter is to show that the neural controls of thermo-
regulatory reflexes and operants are functionally and
neuroanatomically separate and that we can never
fully understand thermal homeostasis without under-
standing its operant aspects.

Operant behavior provides an elegant means of
tapping important features of thermoregulation. A
second purpose of this chapter is to demonstrate the
utility of behavior in interpreting the effects of vari-
ous drugs on body temperature, in analyzing thermal
preferences, and in studying phylogenetic and onto-
genetic differences in thermoregulatory functioning.

Thermoregulation, because it is an exemplary nega-
tive feedback system, is usually discussed within the
framework of control theory. One of the most impor-
tant concepts in control theory is the set point: that
value of the input at which the output is zero. This
chapter will show how behavior is an invaluable tool 
in determining, when body temperature changes, 
whether the change is due to a shift in set point. 
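
The role of the set point can be sketched with a proportional controller, the simplest negative feedback scheme. The linear form, the sign convention, the temperatures, and the function names are all assumptions made for illustration; the chapter does not commit to a particular controller.

```python
def controller_output(body_temp: float, set_point: float,
                      gain: float = 1.0) -> float:
    """Proportional control: the output is zero at the set point.

    Positive output stands for a drive toward heat gain (e.g., pressing
    a bar for a heat lamp); negative output for a drive toward heat loss.
    """
    return gain * (set_point - body_temp)


def operant_choice(body_temp: float, set_point: float) -> str:
    """Direction of thermoregulatory behavior implied by the controller."""
    output = controller_output(body_temp, set_point)
    if output > 0:
        return "works for heat"
    if output < 0:
        return "works for cooling"
    return "no thermal responding"


# The same elevated body temperature (39.0 deg C, hypothetical) under two
# hypotheses about the set point.
print(operant_choice(39.0, set_point=37.0))  # set point unchanged
print(operant_choice(39.0, set_point=40.0))  # set point shifted upward
```

The same body temperature leads to opposite behavior under the two hypotheses, which is why the direction of operant responding can indicate whether a temperature change reflects a shift in set point.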


SEPARATION OF OPERANT AND 
RESPONDENT TEMPERATURE REGULATION 


For many years integrated control of body tempera- 
ture was thought to depend on the integrity of two 
areas of the brain, the preoptic/anterior hypothalamic 
area (for brevity, we shall call this the preoptic area) 
and the posterior hypothalamus. Damage to the pre- 
optic area produced animals that could not reflex- 
ively maintain their body temperatures within normal 
limits when placed in hot environments (Teague &
Ranson, 1936); after posterior hypothalamic lesions
animals were able to regulate normally in the heat, 
but became hypothermic in the cold (Keller, 1963; 
Pachomov, 1962). This concept of two equal and 
opposing centers, one in the posterior hypothalamus 
controlling heat production and the other in the pre- 
optic area controlling heat loss, gradually gave way to 
a theory in which the preoptic area was preeminent in 
temperature regulation. The change occurred for a
number of reasons: 


1. Thermally sensitive units (that is, neurons whose 
firing rates are greatly influenced by their temper- 
ature) were found in much greater abundance in 
the preoptic area than in the posterior hypothala- 
mus (Edinger & Eisenman, 1970; Eisenman & Jack- 
son, 1967). 

2. Heating the preoptic area caused sweating, pant- 
ing, vasodilation, and all other autonomic correlates 
of heat loss with a concomitant fall in body temper- 
ature (Magoun, Harrison, Brobeck, & Ranson, 
1938; Proppe & Gale, 1970). This would be ex- 
pected from a heat loss center. But cooling the 
preoptic region caused the heat-producing responses 
of shivering and increased metabolic rate, the heat 
loss response of vasoconstriction, and a rise in body 
temperature (Hammel, Hardy, & Fusco, 1960; 
Morishima & Gale, 1972; also see Satinoff, 19774, 
for review). This would not be expected. Heating
or cooling the posterior hypothalamus did not 
elicit any of these responses (Adair & Hardy, 1971; 
Freeman & Davis, 1959). 

3. After lesions in the preoptic area, animals were 
unable to regulate their body temperatures reflex- 
ively in either warm or cool environments (Carlisle, 
1969; Satinoff, 1974; Satinoff & Rutstein, 1970; 
Squires & Jacobson, 1968). 



These facts imply that the preoptic area is impor-
tant for maintaining body temperature by activating
reflexive responses. In most of the experiments lead- 
ing to this conclusion, the animals had no opportu- 
nity for operant control of their temperature. Except 
for occasional observations on postural changes, such 
as huddling or sprawling (Freeman & Davis, 1959; 
Hellstrom & Hammel, 1967), measurements were 
made only of body temperature and of such reflexes as 
shivering, panting, and changes in vasomotor tone. 

In 1964, Satinoff combined the neurophysiological 
technique of cooling the brain with the operant 
measure of bar pressing for radiant heat. She found 
that cooling the preoptic region of rats elicited not 
only shivering and an increase in body temperature, 
but operant responding for heat as well. As with re- 
flexes, skin and brain temperatures interact in con- 
trolling operant responding. Brain cooling increases 
the rate of working for heat much more in cold than 
in neutral environments. Conversely, when their hypo- 
thalamus was warmed, rats decreased responding for 
external heat in the cold (Carlisle, 1966; Corbit, 1970; 
Murgatroyd & Hardy, 1970). 

Because preoptic thermal stimulation produces 
both reflexive and operant responding, it is reasonable 
to expect that damage in that area would eliminate 
both types of controls. Although such damage impairs 
reflexive responses, it does not impair thermally moti- 
vated instrumental responding. Lipton (1968) demon- 
strated that rats with preoptic lesions would, when 
placed in a hot environment, turn a heat lamp off and 
a cooling fan on, thereby avoiding death from over- 
heating. Carlisle (1969) later showed that such 
lesioned rats pressed at a much higher than normal 
rate for heat reinforcement in the cold and were able 
to prevent severe hypothermia. Satinoff and Rutstein 
(1970) tested rats with preoptic lesions in a 5°C 
chamber twice a week. In one of the weekly sessions 
no bar was available; body temperatures fell an average
of 2.4°C in a 1-hr session for at least two months
postoperatively. In the other session, which lasted for 
2 hr, holding a bar down kept a heat lamp on, and 
the rats depressed the bar 32% of the time, main- 
taining their temperatures within .7°C of normal. 
Controls kept the bar depressed only .05% of the 
time (Figure 1). 

These experiments demonstrate that operant be- 
havior can compensate for reflexive deficits. In the 
examples cited above, the reflexive deficits were pro- 
duced by hypothalamic lesions. In other experiments, 
reflexive deficits were caused by thyroidectomy (Laties 
& Weiss, 1959), vitamin deficiency (Weiss, 1957; Yeh & 
Weiss, 1963), or various drug treatments (see page
167), and rats also learned to compensate through
instrumental behavior.

Fig. 1. Duration of heat-on time for the first eight trials in the
cold of rats with preoptic lesions and of normal rats maintained
at 80% normal body weight. (From Satinoff & Rutstein, 1970.
© 1970 by the American Psychological Association. Reprinted by
permission.)

It further appears that the neural networks controlling
reflexive and operant thermoregulatory responses
are to a large degree independent of one
another. Operant responses are not integrated solely 
in the preoptic area because they continue to appear 
when that region is largely destroyed. Additional 
evidence for this independence is that lateral hypo- 
thalamic lesions can disrupt thermoregulatory oper- 
ants without affecting reflexive regulation (Satinoff & 
Shan, 1971). Well-trained rats that had pressed a lever 
that turned on a heat lamp in the cold no longer did 
so after lateral hypothalamic lesions. Most of the 
animals were nonetheless able to maintain their body 
temperature reflexively. The operant deficit was not 
always accompanied by impairments in feeding or 
drinking. When it was, the behaviors recovered at 
different rates. For instance, in Figure 2, rat SY18 re- 
gained its preoperative body weight in 10 days, yet it 
did not bar-press for heat at preoperative levels until 
over 40 days had elapsed. 

Fig. 2. Effect of lateral hypothalamic lesions on body weight,
core temperature, responding for heat, and shock avoidance.
Shaded area indicates period of tube feeding; T pre, rectal
temperature immediately before the 1-hr test; T post, rectal
temperature at the end of the test. (From Satinoff & Shan, 1971.
© 1971 by the American Psychological Association. Reprinted by
permission.)

Of course, there are several ways in which a treatment
may result in a loss of (or decrement in) responding.
These include general debilitation, forgetting,
impairments in arousal, motor, or sensory processes,
or any of these in combination. In this case, we can
rule out debilitation, forgetting, and motor problems. 
On some tests in the cold, rats were injected with 
quinine HCl, a drug that lowered their internal 
temperature by interfering with shivering (Satinoff, 
unpublished research). On those tests the rats re- 
sponded at preoperative levels, whereas their response 
rates returned to near zero on nondrug days. This 
demonstrates that the rats were able to make the re- 
sponse and had not forgotten how. Sensory deficits 
possibly contributed to the loss of responding. Rats
with lateral hypothalamic lesions may not be as sensitive
as normals to skin temperature changes (or skin
temperature pathways may be damaged by the lesion
so the signal is inaccurate), and it may require the
addition of a fall in internal temperature to get them
working for heat. There may well also have been a
deficit in arousal, and the lesions may have elevated
the threshold for operant behavior. However, since
thermoregulatory reflex adjustments are made during
sleep, arousal level presumably would not have
affected these components. Thus it appears that
thermally motivated operant behavior depends on a 
distinct neural system passing through the lateral 
hypothalamus. 

Operant and reflexive thermoregulation appear to
be uncoupled in the posterior as well as the lateral
hypothalamus. Although local heating or cooling in
that area does not elicit reflexive thermoregulatory
responses (Adair, 1974; Freeman & Davis, 1959), these
treatments do alter operant thermoregulation. When
squirrel monkeys were given control over their ambient
temperature, they selected higher air temperatures
when the posterior hypothalamus was cooled
and lower air temperatures when it was warmed. This
operant regulation was just as precise as when the preoptic
region was heated or cooled (Adair, 1974). These
results are compatible with the decreased operant responding
after lateral hypothalamic lesions reported
above. The lateral and posterior hypothalamus appear
to be part of the same pathway, which is involved
in the control of thermoregulatory operants. Lesions 
in the posterior hypothalamus generally lead to more 
drastic deficits, including somnolence and complete and 
possibly permanent adipsia and aphagia (McGinty, 
1969). Lateral hypothalamic lesions cause less severe 
effects: the rats are drowsy instead of totally somno- 
lent (Wampler, 1970), and later their adipsia and 
aphagia recover through stages to relatively normal 
eating and drinking (Teitelbaum & Epstein, 1962). 
The medial forebrain bundle, possibly part of a mech- 
anism which facilitates operant behavior (Stein, 1964), 
includes both the lateral and posterior hypothalamus, 
so it is reasonable that lesions in those areas should 
eliminate and stimulation excite the same sorts of behavior.

Fig. 3. Diagrammatic representation of the mechanisms for operant
and respondent temperature regulation. T = temperature;
C = core; LPH = lateral posterior hypothalamus; PO/AH =
preoptic anterior hypothalamus.

Even though the preoptic area appears to be involved
primarily in respondent thermoregulation, preoptic
thermodetectors also affect the operant system.
In fact, both systems can be conceptualized as in
Figure 3, which admittedly is a tremendous oversimplification
with respect to reflexive controls and
probably with respect to operant controls as well.
Nevertheless, it adequately accounts for four important
facts: (1) Preoptic thermal stimulation leads to
coordinated operant and reflexive responses. This is
because the temperature of the preoptic area affects 
neural activity in both preoptic tissue and the lateral 
posterior hypothalamus. (2) Preoptic lesions lead to a 
loss of thermoregulatory respondents while leaving 
operants intact. It is assumed here that operant 
thermoregulation after such lesions depends mainly 
on extrapreoptic temperature receptors. (3) Posterior 
hypothalamic thermal stimulation leads to operant 
responses, but not to reflexive ones. (4) Lateral hypo- 
thalamic lesions eliminate operant responding only. 
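
The four facts just listed can be restated as a small Boolean sketch of the scheme in Figure 3. The Python below is an illustration only and is not part of the original analysis. The names Preparation and heat_gain_responses are hypothetical, thermal signals are treated as all-or-none, and "cold" stands in for any thermal disturbance.

```python
from dataclasses import dataclass


@dataclass
class Preparation:
    """Hypothetical description of one animal: which regions are intact."""
    preoptic_intact: bool = True            # PO/AH
    lateral_posterior_intact: bool = True   # LPH


def heat_gain_responses(prep, preoptic_cooled=False,
                        posterior_cooled=False, periphery_cold=False):
    """Return (reflexive, operant): whether each kind of response is driven."""
    # PO/AH thermodetectors report local cooling only if the tissue is intact.
    preoptic_signal = prep.preoptic_intact and preoptic_cooled
    # Assumption: reflexes (respondents) are organized through PO/AH.
    reflexive = prep.preoptic_intact and (preoptic_cooled or periphery_cold)
    # Assumption: operants are organized through LPH, which receives
    # preoptic, local, and extrapreoptic temperature signals.
    operant = prep.lateral_posterior_intact and (
        preoptic_signal or posterior_cooled or periphery_cold)
    return reflexive, operant


# Fact 1: preoptic stimulation drives both kinds of response.
print(heat_gain_responses(Preparation(), preoptic_cooled=True))
# Fact 2: preoptic lesion removes reflexes but leaves operants.
print(heat_gain_responses(Preparation(preoptic_intact=False), periphery_cold=True))
```

The sketch is deliberately coarse. It says nothing about response magnitude, recovery after lesions, or the nonthermal deficits discussed earlier.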

In thermoregulation, then, reflexive and operant 
responses to thermal stresses are functionally and 
anatomically separate, and animals can compensate 
for deficits in one system through the mechanisms of 
the other system. 


USING BEHAVIOR TO ASSESS 
REGULATION 


The separation of reflexive and nonreflexive 
thermoregulation appears phylogenetically. Fish, am- 
phibians, and reptiles have highly sophisticated non- 
reflexive means of regulating their body temperature, 
whereas automatic mechanisms are either nonexistent 
or few and inefficient (see Templeton, 1970, for a review).
For this reason, ectotherms² are excellent preparations
for studying homeostasis. One need not damage
the brain to isolate its systems. Instead, we can
study organisms with a simpler thermoregulatory
organization (Rozin, 1968).

2 The familiar terms for “cold-blooded” and “warm-blooded”
animals are poikilotherm (from the Greek poikilos, “varied,
changing,” and therme, “heat”) and homeotherm (Greek
homoios, “like”). Since all of these animals thermoregulate, more
appropriate words describe them on the basis of whether the
heat source is external or internal. Hence we are using the more
precise terms ectotherm (Greek ektos, “outside”) and endotherm
(Greek endon, “within”) (Cowles, 1962).

Fig. 4. The percentage of time spent in the sun (open circles)
and the shade (solid circles) by a horned lizard, Phrynosoma
coronatum, during August. The values given are midpoints for
1-hr intervals. (From Heath, 1965. Reprinted by permission of
the University of California Press.)


Thermoregulation in Ectotherms


For many years it was assumed that vertebrates
other than mammals and birds could not control their
body temperatures at all. When lizards, for example,
were trapped and put into a cage in the laboratory,
their temperatures fluctuated with that of the surrounding
medium. However, in their natural environments
these reptiles regulate with a variety of behavioral
mechanisms (Cowles & Bogert, 1944).³ Figure
4 illustrates one common behavior: shuttling back
and forth between sun and shade. This enables the
animal to maintain its body temperature within a
fairly narrow range, generally no more than 3 or 4°C
(Figure 5; Heath, 1965). Once within this range it can
attend to business other than thermoregulating. If the
regulated range were narrower, the lizard would have
to spend all of its time shuttling back and forth. As
Heath (1970) points out, such an animal would be a
very good thermoregulator but a very inefficient
lizard.
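
The cost of a narrow regulated range can be illustrated with a minimal simulation of shuttling between two thresholds. This is a sketch under stated assumptions, not a model taken from the studies cited. The function name simulate_shuttling is hypothetical, as are the threshold values, the sun and shade temperatures, and the heat-exchange rate. Body temperature is assumed to approach the ambient temperature by simple Newtonian exchange.

```python
def simulate_shuttling(t_upper=37.0, t_lower=34.0, sun_temp=45.0,
                       shade_temp=25.0, rate=0.1, start_temp=30.0, steps=500):
    """Simulate a lizard that leaves the sun at t_upper and the shade at t_lower.

    Returns (history, moves): body temperature at each step and the
    number of moves between sun and shade.
    """
    body = start_temp
    in_sun = True
    history = []
    moves = 0
    for _ in range(steps):
        ambient = sun_temp if in_sun else shade_temp
        # Assumed Newtonian exchange: body drifts toward ambient temperature.
        body += rate * (ambient - body)
        if in_sun and body >= t_upper:
            in_sun = False          # too warm: retreat to shade
            moves += 1
        elif not in_sun and body <= t_lower:
            in_sun = True           # too cool: return to sun
            moves += 1
        history.append(body)
    return history, moves


_, wide_moves = simulate_shuttling(t_upper=37.0, t_lower=34.0)
_, narrow_moves = simulate_shuttling(t_upper=36.0, t_lower=35.5)
print(wide_moves, narrow_moves)
```

With these illustrative values the 0.5°C band produces more moves than the 3°C band over the same period, which is the trade-off Heath describes: tighter regulation is bought with more time spent shuttling.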

3 This may not be characteristic of all vertebrates, however.
Bogert (1959) noted that when several green iguanas, the largest
lizards on the American continent, were exposed to direct sunlight
in the desert in summer, they did not seek shade but sat in
the sun until they died. One specimen at the San Diego zoo never
went indoors on cool evenings and had to be taken inside. One
night it was inadvertently overlooked by its keeper and was
found the next morning in a state of cold narcosis.

Fig. 5. A lizard moves from direct sun (left) into shade (middle)
at an upper body temperature, and it moves from shade to sunlight
at a lower body temperature. The difference between these two
temperatures is a refractory range in which the lizard can operate
without resorting to thermoregulatory activities. The effector output
Y is here equated with the availability of sun. Y = 0 in the
shade and Y = maximum in direct sunlight. (From Heath, 1970.)


Because shuttling is a common thermoregulatory
behavior of lizards in their natural environment, it is
relatively easy to study in the laboratory. Hammel,
Caldwell, and Abrams (1967) demonstrated that blue-tongued
lizards regulated their internal temperature
between 30 and 37°C by shuttling between 15 and
45°C chambers. In a different situation, the lizards
were placed in a hot compartment and allowed to
escape to a cooler one. Increasing the temperature of
the hot compartment caused the animals to escape at
lower colonic temperatures (Myhre & Hammel, 1969).
Lizards also learned to go to a platform which, when
depressed by the weight of the animal, turned on an
overhead heat lamp. Response frequencies increased
and response duration decreased with increasing intensity
of the heat reinforcer. As the intensity of the
heat changed, the lizards compensated behaviorally,
receiving a roughly constant amount of heat per hour
(Garrick, 1979).

Frogs and fish also clearly show a thermoregulatory 
component in their behavior. Frogs selected temperatures
from 25-28°C in a thermocline (a long, thermally
graded alley) ranging from 0-40°C (Cabanac &
Jeddi, 1971). Different species of fish aggregated at 
different points in a thermal gradient (Fry & Hoch- 
achka, 1970). Six species of fish were trained to 
regulate the temperature in their tanks by their 
spatial movements (Neill, Magnuson, & Chipman, 
1972). Swimming into warmer water caused the entire 
tank to heat up, whereas swimming into cooler water 
caused a drop in tank temperature. The fish all kept
the tank temperature within a 4-7°C range, the mean
varying for different species, although a range of 22°C 
was available. 

Such experiments do not guarantee that the be- 
havior is under the control of operant contingencies. 
The behavior may not even be temperature-related, 
but may instead depend upon other stimulus features 
of the environment. For example, if a heat lamp is 
used to generate the thermal gradient, the resulting 
behavior may be controlled by the light, rather than 
the heat, produced by the lamp. To demonstrate 
preference, behavior must shift appropriately when 
the relationship between the thermal stimuli and 
other environmental cues is reversed. Furthermore, 
the behavior may be a form of kinesis or taxis 
(Fraenkel & Gunn, 1961). What appears to be choice 
among thermal stimuli may instead be behavior 
elicited by them. In a thermal gradient these stimulus 
functions are confounded. Nevertheless, there are a
few experiments which unambiguously demonstrate 
operant regulation. Iguanas learned to press a disc to 
escape from heat. The responses were independent of 
substrate temperature or heating rate, and appeared 
to depend solely on internal temperature (Kemp, 
1969). Goldfish can press levers and keep the tempera- 
tures of their aquariums between 33.5 and 36.5°C. In 
a thermal titration situation they pressed the lever 
both to decrease water temperature when it was too 
high and to prevent its rising above the desired levels 
(Rozin & Mayer, 1961). 


What Controls the Regulation? 


Thermoregulatory behavior in ectotherms is con- 
trolled by a combination of brain and other body 
temperatures, just as it is in mammals and birds.
When the brainstem was heated to 41°C, lizards 
exited from the hot side of a shuttle box at colonic 
temperatures 1 to 2°C lower than normal. When the
brain was cooled to 25°C, the lizards exited at colonic 
temperatures 1 to 2°C higher than normal (Hammel, 
Caldwell, & Abrams, 1967). Arctic sculpins were placed 
in warm water from which they could escape by swim- 
ming back to water at 5°C to which they had been 
adapted. Heating the forebrain lessened the time 
spent in warm water, while cooling the brain some- 
times suppressed the escape response (Hammel, 
Stromme, & Myhre, 1969). These results have been re- 
peated in several species of fish, from both warm and 
cool waters, and in every case altering brainstem 
temperature affected the tank temperature the fish 
selected (Crawshaw & Hammel, 1971, 1973, 1974). In 
frogs, abdominal (Cabanac & Jeddi, 1971) or spinal
cord (Duclaux, Fantino, & Cabanac, 1973) heating
caused the animals to move toward colder water, indicating
that amphibians also are responsive to changes
in both internal and skin temperature.⁴

If fish, amphibians, and reptiles prefer some 
temperatures to others, and these preferences can be 
altered by thermally stimulating the brain, there must 
be temperature-sensitive neurons in the brain. Both 
cold- and warm-sensitive cells have been found in the 
diencephalon of Australian lizards (Cabanac, Ham- 
mel, & Hardy, 1967) and brook trout (Greer & Gard- 
ner, 1970). 

In summary, ectothermic temperature regulation is 
determined by a combination of skin, brain, and other 
body temperatures, just as it is in mammals and birds. 
These conclusions could only have been drawn on the
basis of behavioral experiments, because nonreflexive 
behavior is the sole or predominant means of thermo- 
regulation in these organisms. 


Thermoregulation in Infants 


Many newborn mammals and birds have great diffi- 
culty maintaining their body temperatures in the 
cold. Because they are so small, they have a large 
surface-to-volume ratio and they lose heat very rap- 
idly. Under natural conditions there are a variety of 
solutions to this problem—staying in a nest, bassinet, 
or marsupial pouch, clinging to the mother, or hud- 
dling with siblings (Dawes, 1968). However, if such 
a newborn is unfortunate enough to stray from the 
mother or nest, it will die at air temperatures that 
would not bother an adult. Is there a regulated body 
temperature even in newborns, and is their problem 
simply that they do not have the mechanisms to main- 
tain it? Or is a temperature control system lacking at 
birth, developing only later in life? We can answer 
this question by providing behavioral opportunities 
for temperature selection. 

4 In these experiments, large deviations from normal brain
temperature (at least 5°C) produced relatively small changes in
the deep body temperature threshold at which behavioral
responses appeared (only 1-2°C). Similar effects have been seen
in mammals. Corbit (1970) reported a number of experiments on
rats in which he examined the effects of changes in hypothalamic
temperature on lever pressing for convective cooling. The
hypothalamic temperature threshold for the behavioral heat loss
response was very high (40.3-43°C). However, reflexive heat loss
responses (which lizards lack) were activated at the much lower
brain temperature of 38°C. It may simply be that the thresholds
for activating behavioral and reflexive thermoregulatory responses
are very different, although see Corbit (1970) for alternative
explanations.

Fig. 6. Response of 1-day-old laboratory mice taken from the
mother and placed together at a moderately warm position in
the temperature gradient. All animals initially faced down the
gradient. (From Ogilvie & Stinson, 1966. Reprinted by permission
of the National Research Council of Canada.)

Piglets less than 1 day old chose thermal environments
that allowed them to maintain their body temperatures
within .03°C of what it was when they were
with the sow (Mount, 1963). This indicates a very
good behavioral thermoregulatory system. Neonatal 
hamster pups were also very sensitive to thermal 
gradients and moved quickly away from cooler areas 
toward a heat source, where they became quiescent. 
However, if they were placed directly under the heat 
source, they did not move away and died from over- 
heating (Leonard, 1974). Similarly, 1-day-old mice 
moved from a cooler to a warmer environment. Figure
6 shows the behavior of six newborn mice placed in a 
thermocline. Within 2 hr they were all in positions at 
substrate temperatures at least 5°C higher than at 
their initial positions (Ogilvie & Stinson, 1966).
Neonatal puppies and rabbits also show thermo- 
regulatory behavior. Puppies aged 12 hr to 10 days 
were placed in the presence of two artificial mothers 
constructed of metal coils, one of which was covered 
with fur. In a warm environment (30°C) the puppies 
preferred the furred coil (Figure 7a). However, when 
the metallic surrogate was warmed and the fur mother 
cooled, the puppies spent almost 100% of the time 
with the metallic mother (Figure 7b; Jeddi, 1970). 
Rabbits oriented toward a furred artificial mother in 
a cold environment but not in a warm one. This be- 
havior was very efficient in regaining normal body 
temperature, which had dropped precipitously before
contact was established (Jeddi, 1971). Thus the search
for contact comfort in very young animals may have a
thermoregulatory component. In fact, Harlow (1971,
p. 70) reports that infant macaques given a choice between
warm wire surrogate mothers and cool cloth
surrogates showed a preference for the warm surrogate
during the first 20 days of life.

Fig. 7. A: In a neutral environment puppies prefer the fur
mother and avoid the metallic one. B: The fur mother has
been cooled to 14°C and the metallic mother warmed to 33°C.
The puppy was fed before the beginning of the test. (From
Jeddi, 1970.)

Generally, young mammals select temperatures 
higher than those chosen by adults. As physiological 
and hormonal capabilities develop and physical char- 
acteristics change (fur appears and surface-to-mass 
ratio declines), selected temperatures become lower. 

Young birds also thermoregulate behaviorally. 
Hogan (1974) observed that when the hen did not 
initiate brooding (which warms the chicks), 3- to 8- 
day-old chicks became cool and stimulated the hen to 
sit by rubbing against her and pecking her feathers. 
In an experimental situation, two breeds of 1- and 2-day-old
chicks quickly learned to peck a key when that
response was reinforced with heat and light (Zolman,
1968).

Fig. 8. Control diagram of the relation between set point (or reference input), actual
body temperature, and a reflexive response. The comparators (circles) are mixing points.
Whenever the combination of pluses and minuses does not cancel out, an error
signal is generated. When this occurs (that is, whenever there is a disturbance such that
heat gain and heat loss mechanisms are not at minimum levels), a response is activated
which alters the regulated body temperature. Information from temperature receptors
is then fed back to the comparator and the error signal is adjusted. Several points must
be clarified: (1) There is no single regulated body temperature. That term is a convenient
fiction for some mathematical combination of all the temperatures that contribute to
effector output (Brown & Brengelmann, 1970). (2) The reference input variables leading
into the set point indicate that the set point is not constant but fluctuates because of
the influence of a variety of nonthermal inputs. (3) This diagram is not sufficient to
describe the control of operant responses. For that, additional loops feeding back to
the response controller are required for both response effectiveness and response cost (Van
Sommers, 1972).

Is the thermoregulatory behavior of neonatal mam- 
mals and birds under the control of operant con- 
tingencies? The interpretation of existing behavioral 
experiments remains ambiguous. Of the experiments 
discussed above, only the behavior of the young chicks 
that pecked a key for heat is clearly under the control 
of an operant contingency, and even this may have a 
large respondent component (Wasserman, 1973). 

In summary, many neonatal animals that do not 
possess reflexive mechanisms sufficient for maintaining
body temperatures nevertheless demonstrate thermo- 
regulatory capabilities when provided with behavioral 
opportunities to do so. In this respect, young mam- 
mals and birds are much like ectotherms. 


THERMOREGULATION AND THE 
CONCEPT OF SET POINT 


So far we have been describing how animals lack- 
ing reflex mechanisms are nonetheless able to thermo- 
regulate behaviorally. Normally, of course, in mam- 
mals and birds, both reflexes and operants maintain 
a constant body temperature. Control theory provides 
a useful framework for describing this thermal inte- 
gration. A control system maintains its output (actual 
body temperature) at or near some reference value 
(thermal set point). If there is a discrepancy (error 
signal) between the set point and the achieved output,
corrective measures (effector responses) which reduce
the error are activated. If the system is working 
optimally, temperature is maintained as closely as 
possible to the set point (Figure 8; see Milsum, 1966, 
for a comprehensive discussion of biological control 
systems). Set point is the value of the input of a con- 
trol system at which the output is zero. It is neither a 
theory nor an explanation; it is merely a descriptive 
device which is useful in describing the operation of 
homeostatic systems. Clearly, there would be no set 
point without a nervous system, but for our purposes 
it does not matter where or how the set point is 
achieved. It is irrelevant whether the reference tem- 
perature is a function of the difference in firing rates 
between temperature-independent and temperature- 
sensitive neurons, or whether it is the point at which 
warm- and cool-sensitive neurons are minimally active 
(to enumerate just two of the ways it could be repre- 
sented). This does not imply that the same set point 
will be adequate for describing characteristics of the 
system at more fine-grained levels of analysis. Thermo- 
regulatory functioning may be described in terms of 
one, two, or many set points. The formulation we 
choose depends on the aspect of the system being 
studied. For example, for a physician concerned about 
a feverish patient, the deviation of the patient’s body 
temperature from normal is all he needs to know in 
order to decide whether or not to institute corrective 
measures. The physician can operate as if there is a
single set point, as illustrated in Figure 9a. We have 
already seen that although reflexive and operant
mechanisms normally work in concert, they are in fact
functionally distinct. Therefore, the behavioral scientist
must use a more detailed level of analysis. He can
operate within the framework of Figure 9b. For the
physiologist interested in a particular thermoregulatory
response, such as vasomotor tone, the level of
analysis must be finer still, allowing a detailed,
quantitative analysis of the vasomotor controller. He
might use the control diagram outlined in Figure 9c.
For most purposes in this chapter, a single set point
notion is sufficient to characterize general features of
thermoregulation.

Fig. 9. Schematization of three possible controlling systems for
thermoregulation. For clarity, feedback loops are not shown.
The controller equation relating error signal and response
determines both the threshold and the size of the response.
Note that models (b) and (c) are equivalent to model (a) if we
assume that the thresholds for elicitation of the various responses
need not be identical. (a) One central thermostat whose
output activates all available operant and reflexive thermoregulatory
responses. (b) Two central thermostats, one controlling
all reflexive, the other all operant responses. (c) Each
thermoregulatory response is independent of any other.
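
The control loop described in this section can be sketched in a few lines of Python. The sketch is illustrative only and is not drawn from the sources cited. The function names, the proportional controller, the gains, thresholds, and heat-loss coefficient are all assumptions, and the units are arbitrary. Each effector is given its own threshold on a single shared error signal, which corresponds to the single-thermostat reading of Figure 9 with nonidentical response thresholds.

```python
def error_signal(set_point, body_temp):
    """Comparator: discrepancy between the reference value and the output."""
    return set_point - body_temp


def effector_output(error, gain, threshold=0.0):
    """Assumed proportional controller: zero output until error exceeds threshold."""
    return gain * max(0.0, error - threshold)


def simulate(set_point=37.0, ambient=5.0, effectors=((2.0, 0.0),),
             loss_coeff=0.05, start=37.0, steps=200, dt=0.1):
    """Return body temperature after `steps` iterations of the feedback loop.

    `effectors` is a sequence of (gain, threshold) pairs, one per
    heat-gain response; an empty sequence means no regulation at all.
    """
    body = start
    for _ in range(steps):
        error = error_signal(set_point, body)
        heat_gain = sum(effector_output(error, g, th) for g, th in effectors)
        heat_loss = loss_coeff * (body - ambient)   # passive loss to the cold
        body += dt * (heat_gain - heat_loss)        # feedback to the comparator
    return body


regulated = simulate()                 # one effector, threshold at zero error
passive = simulate(effectors=())       # no effectors: body drifts toward ambient
print(round(regulated, 2), round(passive, 2))
```

In this sketch the regulated temperature settles slightly below the set point, because a purely proportional controller needs some residual error to keep its effector active. The unregulated temperature drifts toward ambient.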

The thermoregulatory set point is affected by a 
variety of internal and environmental variables and 
fluctuates from time to time. Because ectotherms have 
only nonreflexive regulatory mechanisms, we can see 
many of the determinants of the set point more clearly 
in them than in mammals.


Circadian Rhythms


One might assume that diurnal reptiles become 
cold at night because they have no choice; they derive 
their heat primarily from the sun, so they necessarily 
cool down and become inactive when the sun sets. 
However, the decrease in body temperature which 
occurs at night is more than the passive result of 
cooler surroundings. Four species of lizards living in a 
thermocline for several days maintained a high body 
temperature when the lights were on, whether they 
were active or not, but they selected the cooler part of 
the gradient during the lights-off part of the cycle 
(Regal, 1967). Thus the lizards moved into ambient 
temperatures at which they became sluggish and 
uncoordinated. 

Because lizards cool down at night, they have prob- 
lems in the morning. If they were to remain under- 
ground until their burrows warmed up sufficiently to 
allow them to begin daily activity, they would lose a 
lot of valuable activity time. Fortunately for them, a 
warm burrow is not necessary for activity. In the 
laboratory, horned lizards emerged from their bur- 
rows before sunrise at body temperatures about 15°C 
below their normal activity levels. Two groups of
lizards were kept at constant conditions of 18 to 27°C,
except for 8 hr a day when they had access to heat
lamps. Both groups emerged from the sand at about
the same time prior to the onset of the heat lamps, although,
of course, each group’s internal temperature
was close to that of its environment (Heath, 1962).
Thus reptiles appear to show a circadian variation 
in their thermal set points. 

Mammals also show circadian fluctuations in body 
temperature. If we extrapolated from reptiles, we
would say that these diurnal oscillations are rhythmic
shifts in set point. However, peaks in body temperature
normally coincide with periods of activity. Are
animals active because their temperatures are high, or 
are their temperatures high because they are active? 
One way of answering this question would be to 
measure all reflexive thermoregulatory responses 
(metabolic rate, vasomotion, etc.). If different body 
temperatures are observed, and yet the reflexive 
thermoregulatory responses do not vary so as to
change them, then one could assume a shift in set 
point (Hensel, 1974). Another, less cumbersome way 
would be to use operant behavior to determine 
thermal preferences. If there are rhythmic shifts in set 
point, preferred temperature should reflect them. 


Hormonal State 


Regulated temperature varies with reproductive
condition. Pregnant blue spiny lizards regulated at
lower temperatures than did males or postparturient 
females. They tolerated lower minimum internal 
temperatures and stopped basking at lower maximums
(Garrick, 1974). Plasma progesterone levels
were twice as high in pregnant lizards of this species 
as in postparturients (Callard, Chan, & Potts, 1972), 
which suggests that this hormone may be involved in 
the change in behavior. When Garrick injected
progesterone into postparturient lizards, it lowered 
the temperature around which the lizards regulated. 
Progesterone caused hyperthermia in female rats 
(Freeman, Crissman, Louw, Butcher, & Inskeep, 1970) 
and in humans (Gordon, 1972). This appears to be an 
upward shift in set point, because human females pre- 
ferred warmer stimuli during the luteal phase of the 
menstrual cycle, when progesterone levels are high, 
than during the follicular phase, when levels are low
(Cunningham & Cabanac, 1971). Of course, thermoregulatory
changes during the menstrual cycle may
not be caused by progesterone per se. Levels of other
gonadal hormones, particularly estrogen, wax and 
wane simultaneously, and nongonadal hormones affect 
temperature as well. Regardless of which hormones 
are involved, however, it is clear that the set point 
changes with reproductive condition in lizards and 
mammals, although in opposite directions. 

Thermoregulatory effector mechanisms serve nonthermoregulatory
functions as well. Interpreting behavior
changes is difficult, because nonthermal
factors may produce such changes. For example, Mc- 
Lean and Coleman (1971) noted that female rats 
housed in large cages showed less of a drop in body 
temperature during estrus than did restricted rats. 
They concluded that the increased activity commonly 
seen in estrual rats was a response to a lower body 
temperature. This conclusion is not warranted. In- 
creased activity may be a sign of sexual agitation, and 
sexual responses may take precedence over thermo- 
regulatory responses during estrus. If the activity were 
truly a response to a lower body temperature, one 
would expect to see increases in other heat-producing 
and heat-conserving behaviors at the same time, but 
food intake and nest building actually decrease during 
estrus (Wade, 1972).


Food Intake 


Digestion is another activity which affects thermo- 
regulation. Regal (1966) noted that certain reptiles, 
active at low temperatures, moved toward the warmer 
parts of their terraria after they had eaten. He 
measured the body temperature of a boa constrictor 
by sewing a thermocouple into one of the mice that
the snake ate. The snake moved so as to keep warm
that part of the body containing the bolus. As soon as
the snake defecated, basking abruptly ended. Thus in
cool-active reptiles, digestive requirements determine 
thermoregulatory activity. 

Reptiles need more heat after eating so that they 
can digest their meals. Mammals need less external 
heat after eating because of increased metabolic heat 
production. Rats bar-pressed more for heat when they 
were fed immediately after a 1-hr test session in the 
cold than when they were fed before the session 
(Hamilton & Sheriff, 1959). Internal temperatures 
were elevated after feeding and did not return to pre- 
feeding base lines for an average of 2 hr (Grossman & 
Rechtschaffen, 1967). Decreased rates of pressing for
heat following a meal thus help maintain a constant 
thermal balance. Varying the quality of the diet also 
affects response rate for heat. Rats fed a high-fat or
high-carbohydrate diet gained weight, and worked less
for heat than they did when they were fed a high-
protein or powdered chow diet (Hamilton, 1963).
Conversely, rats fed a high-fat diet or made hyper- 
phagic by hypothalamic lesions worked more than did 
controls to escape heat (Lipton, 1969). The increases 
in responding were related to higher body weight. 
These changes in response rate for both heat escape 
and heat reinforcement can be interpreted as be- 
havioral compensation for the lessened ability to lose 
heat reflexively (fat animals have more insulation). 
There does not appear to be any shift in thermal 
set point. 


THERMAL PREFERENCE 


Control theory is a way of systematically describ- 
ing how preferences for thermal stimuli vary under 
different conditions. Stimuli that decrease the devia- 
tion from set point are desired, whereas those that in- 
crease the error signal are aversive. Factors such as 
fever, circadian rhythms, and hormonal changes which 
shift the set point do not change the basic relationship 
between error signal and stimulus preference. Thus
the same thermal stimulus may be positively reinforc-
ing in one condition and aversive in another. The
value of a given stimulus is determined by the context 
within which it is applied. 
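
The relationship just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `stimulus_valence`, the treatment of the error signal as the absolute difference between set point and body temperature, and all numerical values are hypothetical and are not taken from any of the experiments cited.

```python
def stimulus_valence(set_point, body_temp, stimulus_effect):
    """Classify a thermal stimulus as reinforcing, aversive, or neutral.

    stimulus_effect is the hypothetical change in body temperature
    (degrees C) that the stimulus would produce if it were applied.
    """
    error_before = abs(set_point - body_temp)
    error_after = abs(set_point - (body_temp + stimulus_effect))
    if error_after < error_before:
        return "reinforcing"   # stimulus decreases the deviation from set point
    if error_after > error_before:
        return "aversive"      # stimulus increases the error signal
    return "neutral"


# The same warming stimulus (+0.5 C) at the same body temperature (38.5 C),
# evaluated against a normal set point and against a raised (febrile) one.
normal = stimulus_valence(set_point=37.0, body_temp=38.5, stimulus_effect=0.5)
febrile = stimulus_valence(set_point=39.5, body_temp=38.5, stimulus_effect=0.5)
```

In this sketch a shift in set point changes which stimuli reduce the error, but the rule that maps error reduction onto preference stays the same.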


Preference and Subjective Pleasure 


Preferences for thermal stimuli are related to re- 
ports of subjective comfort. Just as people can judge 
brightness, loudness, and warmth, so can they judge
the pleasure a particular stimulus provides. Pleasure 
judgments are assessments of affective quality, whereas 
preference is a description of choice behavior. Noth- 
ing in their respective definitions demands that the 
two measures be associated. Nevertheless, it is prob- 
ably reasonable to assume that stimuli judged as 
pleasant would also be desirable in preference tests, 
while those judged as unpleasant would be aversive. 

Cabanac (1971) has beautifully demonstrated how 
judgments of pleasure depend on context. Each of his 
subjects was immersed up to his chin in a tub of water 
whose temperature was controlled by the experimen- 
ter. With heat loss controlled this way, the subject’s 
core temperature could be maintained above or below 
his set point. (Note that this procedure does not 
change the set point, but instead increases the error 
signal by altering the regulated variable—actual body 
temperature.) While sitting in the bath, each subject 
dipped his left hand for 30 sec into a container filled 
with water at a particular temperature and judged 
the pleasure provided by this thermal stimulus. Then 
the subject put his hand back into the bath until the 
sensation disappeared, then dipped his hand once 
more into the container, which was now filled with 
water of yet a different temperature, and judged it. A 
series of such ratings is shown in Figure 10a. The
subjects perceived stimuli at the extreme ends of the 
scale as pleasant or unpleasant depending upon 
whether they were hypothermic or hyperthermic. 
Thus when they were hypothermic (internal tempera- 
ture below 37°C) they reported that warm or hot 
stimuli to the hand were pleasant. When they were 
hyperthermic (internal temperature above 37°C) 
they perceived cool or cold hand stimuli as pleasant. 
The change in hand temperature was not sufficient to 
change deep body temperature. However, in every 
case, the pleasant stimulus was one that, had it been 
extended over the entire body, would have decreased 
the difference between the set point and the actual 
body temperature. 

Similar results were obtained in an experiment in 
which the dependent variable was a measure of pref- 
erence (Cabanac, Massonnet, & Belaiche, 1972). Sub- 
jects sitting in a water bath had to manipulate a valve 
to change the temperature of another bath in which 
an arm was immersed. Results paralleled those from 
the experiments using judgments of thermal pleasure. 
Thermal preference systematically changed when the 
temperature of the bath in which the subjects sat was 
varied. 

Fig. 10. (a) Judgments of thermal pleasure given by a single
subject when hypothermic (open symbols) and hyperthermic
(closed symbols). (b) Similar judgments made when the subject
was feverish (open circles) or hyperthermic (closed circles). The
bath temperature was 36°C. The subject's internal temperature
was 38.5-39.0°C. (From Cabanac, 1969.) [Figure not reproduced.
Vertical axis: thermal sensation, from very unpleasant (-2) to
very pleasant (+2); horizontal axis: stimuli, °C.]

In the experiments discussed above, the set point
remained constant while body temperature was al-
tered. The error signal can also be manipulated by
changing the set point while maintaining a constant 
body temperature. Fever can be described as an up- 
ward shift in set point (see page 166). Cabanac (1969) 
tested a subject when he had a fever (because of in- 
fluenza) and when he was well. The temperature of 
the water bath, and hence the subject’s internal 
temperature, was identical in both series of tests. 
Judgments of pleasure, however, were very different 
(Figure 10b). The subject liked warmer stimuli when
he had a fever and cooler stimuli when he was well 
but hyperthermic. These experiments were later re- 
peated in four other subjects (Cabanac & Massonnet, 
1974), and the results strongly support the theory that 
reported pleasure is determined by deviations from
set point. They further imply that when body tempera- 
ture is abnormal, preference measures can determine 
whether or not the abnormal level is caused by a 
change in set point. We shall develop this idea more 
fully when we discuss drug administration. 

Cabanac’s experiments demonstrate that the same 
peripheral stimulus is perceived as pleasant or nox- 
ious depending upon the person’s internal state. 
Placing a subject in a temperature-controlled bath 
affects peripheral, core, and brain temperatures. In-
formation from all three normally covaries. Are these
signals functionally interchangeable, or can organisms
distinguish among them? When Corbit and Ernits 
(1974) warmed the hypothalamus, rats pressed a bar 
that cooled their hypothalamus rather than a bar 
that lowered the air temperature. When the skin was 
warmed, the opposite preference appeared. When the 
animal has no choice (e.g., when its hypothalamus is 
being warmed and all it has available to it is a bar 
that cools the air), it will use any opportunity which 
is available to decrease the error signal. However, the 
animal can detect the site of the disturbance and, 
given the choice, will direct its behavior toward chang- 
ing the temperature at that site. 


Nonthermal Determinants of Thermal Choice 


Operant selection of thermal reinforcers depends 
upon what other sorts of reinforcers are concurrently 
available. One may forego the opportunity to thermo-
regulate efficiently in order to engage in a more highly
preferred activity, as when an avid football fan sits 
shivering in the cold to watch an exciting game. 

Carlisle and Snyder (1970) demonstrated this effect 
very dramatically. Rats bar-pressed for heat and main- 
tained their body temperatures very well in the cold. 
However, when a lever press which produced elec- 
trical stimulation in the posterior hypothalamus was 
concurrently available, the rats worked for the brain 
stimulation exclusively, allowing their body tempera- 
tures to fall to the point of death. In another experi- 
ment (Weiss & Laties, 1963), injections of d-ampheta- 
mine increased the rate of heat-reinforced bar pressing 
of rats in the cold. The opposite result, a decrease in 
rate, was produced when a food-reinforced fixed-ratio 
schedule was concurrently available (Laties, 1971). 
Thus effects of thermal reinforcement (or any type of 
reinforcement, for that matter) must be considered in 
a broad context which includes other available rein- 
forcers. 

Thermoregulatory behavior also depends on the 
amount of effort involved in making a response. Mon-
keys pulled a chain that warmed their chamber. When
the force requirement was increased, the monkeys 
tolerated wider air temperature fluctuations and their 
interresponse times lengthened. Eventually they 
stopped working completely and just sat and shivered 
(Adair & Wright, 1976). Response cost also deter- 
mines behavioral thermoregulation in the tropical 
lizard Anolis cristatellus (Huey, 1974). In an open park,
where basking sites were readily available, the lizards 
kept their body temperatures within fairly narrow 
limits. In an adjacent forest where shuttling from 
shade to sun required much more movement, the 
animals tolerated lower and more variable body 
temperatures. 


OPERANT CONTINGENCIES IN 
THERMAL HOMEOSTASIS 


Thermoregulation differs in important ways from
other regulatory systems. It should come as no sur- 
prise, then, that there are substantial differences be- 
tween the effects of thermal reinforcers and the effects 
of more traditional reinforcers such as food and 
water. For instance, one feature of thermally rein- 
forced behavior that may seem curious is that response 
rate varies inversely with magnitude and duration of 
heat reinforcement (Carlisle, 1966; Weiss & Laties,
1961). This would be expected if we consider how the
reinforcement is tied to thermal homeostasis. The an- 
imals do not respond so as to produce a maximal 
amount of heat; rather, they produce an optimal 
amount. In most experiments with heat reinforce- 
ment, the animal can reach “satiation” (i.e., set point) 
within a single session or even within a fraction of a 
session. Because the animal stops responding whenever 
set point is reached, an inverse relation between rein- 
forcement rate and response rate is to be expected 
under these conditions. 
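
The argument can be made concrete with a toy simulation. The sketch below is not a model of any cited experiment; the function `responses_per_session`, its parameters, and all numerical values are assumptions chosen only to show why an animal that stops responding at set point makes fewer responses when each reinforcement delivers more heat.

```python
def responses_per_session(heat_per_reinforcement, set_point=37.0,
                          start_temp=35.0, cooling_per_step=0.1,
                          steps=600):
    """Count responses by a simulated animal that presses only while
    its body temperature is below set point (hypothetical parameters)."""
    temp = start_temp
    responses = 0
    for _ in range(steps):
        temp -= cooling_per_step            # passive heat loss in the cold
        if temp < set_point:                # error signal present: respond
            temp += heat_per_reinforcement  # each response delivers heat
            responses += 1
    return responses


small = responses_per_session(heat_per_reinforcement=0.2)
medium = responses_per_session(heat_per_reinforcement=0.5)
large = responses_per_session(heat_per_reinforcement=1.0)
```

Because the simulated animal works only to offset a fixed rate of heat loss, the number of responses falls as the heat delivered per response rises, which is the inverse relation described above.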

Although animals perform very well under inter- 
mittent schedules with ingestive reinforcers, this is not 
the case with thermal reinforcers. Carlisle (1969) had 
difficulty obtaining stable fixed-ratio (FR) responding
in rats with schedules as low as FR 5 or FR 10. How- 
ever, when bar pressing was reinforced intermittently 
by access to heat on a continuous reinforcement (CRF) 
schedule with a second bar (an FR-CRF chained 
schedule), good performance was obtained with sched- 
ules as high as FR 128 (Carlisle, 1970). Pliskoff, Wright, 
and Hawkins (1965) obtained similar results with 
rewarding brain stimulation, another case in which 
performance on intermittent schedules is often quite 
poor. Thus thermal reinforcers resemble brain stimu-
lation more than they resemble ingestive reinforcers.
This is not surprising considering the similarities be- 
tween the two. Both are direct and prompt, no con- 
summatory response is made, and neither can be 
stored the way food and water can. 


Avoidance of Thermal Change 


People anticipate temperature changes. Someone 
who goes outside on a bitterly cold day equips him- 
self with coat, hat, and gloves before leaving the 
warmth of a heated building, or, knowing that it is 
very hot outside, chooses not to leave his air-condi- 
tioned home. Can animals similarly anticipate and 
avoid thermal change? 

Rats can avoid heat. Matthews, Morin, and Church 
(1971) placed animals in a temperature-controlled 
chamber and programmed a version of a Sidman 
(1953) avoidance schedule. A 5-sec exposure to heat 
occurred every 5 sec if a bar was not pressed. Each 
response delayed the onset of the next heat exposure 
for 15 sec. When ambient temperature was manipu- 
lated, behavior which avoided very hot air occurred 
at different rates. Response rates were higher in the 
hotter environments. Thermal avoidance thus de- 
pends on the relationship between the ambient tem- 
perature and the thermal effects of responding rather 
than on the absolute values of the stimuli. This is 
similar to the context dependencies we noted previ- 
ously in Cabanac’s work. 
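
The timing rules of such an avoidance schedule can be sketched as follows. The intervals are those given in the text (5-sec exposures, 5 sec between exposures, 15-sec postponement per response); everything else is an assumption made for illustration: the function name `heat_onsets`, the choice to time the 5-sec interval from the end of one exposure to the start of the next, and the rule that responses made during an exposure have no effect.

```python
def heat_onsets(response_times, session_length,
                heat_heat=5.0, response_heat=15.0, heat_duration=5.0):
    """Return the times (sec) at which heat exposures begin.

    heat_heat:     interval between exposures when no response occurs
    response_heat: postponement of the next exposure after a response
    heat_duration: length of each exposure
    """
    responses = sorted(response_times)
    onsets = []
    next_onset = heat_heat
    i = 0
    while next_onset < session_length:
        if i < len(responses) and responses[i] < next_onset:
            # A response before the scheduled onset postpones the exposure.
            next_onset = responses[i] + response_heat
            i += 1
            continue
        onsets.append(next_onset)
        heat_end = next_onset + heat_duration
        # Assumption: responses made during an exposure are ignored.
        while i < len(responses) and responses[i] < heat_end:
            i += 1
        next_onset = heat_end + heat_heat
    return onsets


no_responding = heat_onsets([], session_length=30)
steady_responding = heat_onsets([0, 10, 20, 30, 40, 50], session_length=60)
```

Under these assumptions an animal that responds at least once every 15 sec receives no heat at all, whereas an animal that never responds is heated for half of the session.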

Analyses of avoidance learning that emphasize 
species-specific defense reactions (Bolles, 1970) have 
been based almost entirely on data collected with elec- 
tric shock as the aversive stimulus. By using heat as an 
aversive stimulus, one can study avoidance learning in 
a situation where reactions other than those charac-
teristically produced by shock may be prepotent.
Overheated rats are initially active, but later spread 
out and sprawl (Roberts, Mooney, & Martin, 1974). 
That rats learn to bar-press to avoid and to escape
heat suggests they can learn responses which are quite 
different from their unconditioned reactions to the 
aversive stimulus. 


THE OPERANT AS A MEASURE OF SET 
POINT AFTER DRUG ADMINISTRATION 


Operant methods are extremely useful in analyzing 
drug effects on body temperature. If all we know 
about a drug is that it changes body temperature, we 
cannot assign it any particular role in thermoregula-
tion. This problem arises because there are a number
of ways in which drugs alter body temperature. For
instance, many general anesthetics depress central 
nervous system function, including most thermo- 
regulatory reflexes (Lomax, 1970), and body tempera- 
ture then varies with ambient temperature.5 Drugs 
like amphetamine generally stimulate behavior, mak- 
ing an animal active and emotional, and body temper- 
ature may passively rise. Other drugs act directly on 
effector mechanisms involved in thermoregulation. For 
example, cholinesterase inhibitors cause profuse sweat- 
ing and salivation (Koelle, 1970) and so lower body 
temperature. Adrenergic blocking agents may pro- 
duce the same outcome by causing peripheral vaso- 
dilation (Nickerson, 1970). Some compounds shift the 
set point. Pyrogens, for example, displace it upward. 
This phenomenon is called fever. 

What we want to know about any drug that affects 
body temperature is whether it changes the set point 
or merely alters effector activity. One of the easiest 
ways to determine whether a body temperature change 
represents a change in set point is to measure the 
thermally reinforced responses that accompany it. Op-
erant behavior reflects the error signal, that is, the differ-
ence between the set point and the achieved tempera-
ture. Because of this we can make the following two
assertions: 


1. If an animal’s body temperature changes to a new
level, and if the animal selectively performs oper-
ants which defend the new level against deviations
in either direction, then the temperature change
represents a set point displacement.


2. If an animal's body temperature changes to a new
level, and if the animal selectively performs oper-
ants which counteract the change, then the temper-
ature change represents something other than a
shift in set point.
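
The two assertions amount to a simple decision rule, sketched below in Python. The function name `interpret_temperature_change`, the coding of each measure as a signed change from base line, and the extra "nonselective" and "indeterminate" outcomes are assumptions introduced for illustration; the last two merely flag cases in which responding is not selective and the assertions therefore do not apply.

```python
def interpret_temperature_change(temp_change, warmth_change, cooling_change):
    """Apply the two assertions to one set of observations.

    Each argument is a signed change from base line: body temperature,
    responding that produces warmth, and responding that produces cooling.
    """
    if warmth_change * cooling_change > 0:
        # Both operants rose or both fell: responding is not selective.
        return "nonselective"
    # Net behavioral tendency: positive means working to get warmer.
    drive = warmth_change - cooling_change
    if temp_change == 0 or drive == 0:
        return "indeterminate"
    if (temp_change > 0) == (drive > 0):
        # Behavior pushes temperature toward the new level (assertion 1).
        return "set point displacement"
    # Behavior opposes the temperature change (assertion 2).
    return "not a set point shift"


# Temperature rises; the animal works more for warmth and less for cooling.
defended_rise = interpret_temperature_change(+1.5, +1, -1)
# Temperature rises; the animal works more for cooling and less for warmth.
opposed_rise = interpret_temperature_change(+1.5, -1, +1)
```

The numerical arguments carry only their signs; any real application would need a criterion for deciding when a change in responding is large enough to count.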


To illustrate these points, let us consider the effects 
of raising the body temperature in two different ways 
in a situation in which an animal has the opportunity 
to alter its thermal environment: 


1. Make the animal febrile by injecting a pyrogen. If
pyrogens raise the set point, the animal will re-
spond for warmth, but not for cooling, until its
body temperature approaches the new, elevated
set point. A commonplace example of this is that
people in the first stages of a fever report feeling
cold and try to warm themselves.


2. Make the animal hyperthermic by placing it in a
hot environment. In this case, body temperature
rises not because the set point is elevated, but be-
cause reflexive heat loss mechanisms are inadequate
to the task of removing excess body heat. The ani-
mal should work to cool itself, whereas it should
not work for warmth.


5 In the days before air conditioning a serious problem with
surgery on hot summer days was keeping the patient’s tempera-
ture down. Now, with operating rooms maintained at about
24°C, the problem is to keep it up at normal levels.


Compounds That Alter Set Point 


Pyrogens cause fever. The current concept of fever 
is that injection or natural entry of bacterial pyrogens 
into the body causes the release of an endogenous 
pyrogenic substance from blood leucocytes. This leu-
cocytic pyrogen then travels to the brain and acts on
thermosensitive cells, raising the set point. This leads 
to increased heat production and decreased heat loss 
until body temperature rises to the new set level. 
Fever was accompanied by heat-producing instru- 
mental responding in baboons (Gale, Mathews, & 
Young, 1970), cats (Weiss, Laties, & Weiss, 1967), and 
dogs (Cabanac, Duclaux, & Gillet, 1970). Febrile dogs, 
for example, worked more for heat in the cold and 
less for cool air in hot environments than did normal 
animals (Figure 11). 

Fig. 11. Operant responding for infrared heat and for cool air
in a dog with and without a fever. Responding varies systema-
tically with ambient temperature. When feverish, the dog
responds more for heat and less for cool air than when not
feverish. (From Cabanac et al., 1970.)

The height of a fever is determined to some extent
by the ambient temperature; it is lower in the cold
(Fekety, 1963; Weiss et al., 1967). Does this imply that
fever represents an alteration in the sensitivity of
peripheral cold receptors? Operant measures suggest
a different interpretation. In the Weiss et al. experi-
ments, the cats were also tested in a chamber which
had a heat-producing lever. When the operant was
available, the fever of cats in the cold was substan-
tially higher than when it was not available, and in
fact was very close to the levels attained in a neutral 
environment. It thus appears that reflexive mech- 
anisms are simply inadequate to accomplish the full 
rise in temperature when the environment is cold. 

Can reptiles develop fevers? Despite an almost com- 
plete lack of reflexive thermoregulatory mechanisms, 
ectotherms, as we have seen, do indeed regulate their 
temperatures when they have the behavioral oppor- 
tunity to do so. 

In an ingenious experiment, Bernheim, Vaughn, 
and Kluger (1974) allowed iguanas to adjust their 
body temperatures by shuttling between cold and 
warm chambers. Then they injected the iguanas with
a pyrogen (one that caused fever in rabbits). Follow- 
ing the injection the iguanas spent more time in the 
warmer side of the chamber and developed an average 
fever of 2°C. Another group of lizards were given the 
same dose of pyrogen but kept in a constant ambient 
temperature (below the febrile level). There was no 
change in body temperature, indicating that the fever 
was produced solely behaviorally. 

Other sorts of data can be combined with operant 
measures to assess changes in set point. Clark and 
Coldwell (1973) reported that after intraventricular 
injection of tetrodotoxin—the puffer fish poison—cats 
had lowered body temperatures even though regula- 
tory mechanisms appeared to be intact. While the 
animals were recovering but still hypothermic, they 
were exposed to infrared heat lamps, which raised 
their temperature sharply. When the lamps were 
turned off, their body temperature returned to where 
it would have been had the heat load not been im-
posed. The cats did not shiver while they were recov-
ering from the hypothermia caused by the poison, but
when body temperature was further lowered with ice 
packs, they did shiver. This is good evidence that 
tetrodotoxin lowers the thermoregulatory set point. 
Clark and Lipton (1974) then showed that patterns of 
instrumental responding for thermal reinforcement 
were compatible with this interpretation. After tetro- 
dotoxin, while body temperature was falling rapidly, 
the cats increased lever pressing to escape heat in a 
warm environment and decreased pressing for heat in 
the cold. Declines in body temperature were very 
similar whether the cats were in the cold without a 
bar, or whether they were responding for heat rein- 
forcement or escaping from heat. The lowered body 
temperature represents a downward shift in set point. 

Many drugs lower body temperature, but their 
mechanisms of action may be dissimilar. Behavior 
differentiates among them. To illustrate this, we shall 
consider three compounds, all of which, when injected
systemically, lower body temperature in rats, but 
which have very different effects on operant respond- 
ing: chlorpromazine, quinine, and sodium salicylate. 
Chlorpromazine depresses operant behavior in a dose- 
related manner regardless of whether the response pro- 
duces heat in the cold (Weiss & Laties, 1963) or 
reduces heat in a warm environment (Polk & Lipton, 
1975). Thus the hypothermic effect of chlorpromazine 
is probably not due to its specific action on tempera- 
ture pathways, but rather is a side effect of the general 
central depressant properties of this drug. Quinine, on 
the other hand, increases lever pressing for heat in the 
cold (Satinoff, unpublished research). This is not a 
general excitation but rather a specific thermoregula- 
tory effect; in food-deprived rats, responding for food 
decreases after quinine injection (Figure 12). Behav- 
ior compensated for the fall in body temperature, 
suggesting that quinine was simply acting on one or 
several effector mechanisms. In fact, when we mea- 
sured the physiological responses of quinine-treated 
rats in the cold, we found that all reflexive responses 
were normal except shivering, which was greatly re- 
duced. Sodium salicylate has a quite different effect— 
the rats increased responding to escape heat even 
while their body temperature was dropping (Polk & 
Lipton, 1975). Thus there was a coordinated change 
in body temperature and instrumental behavior.

In summary, all of these compounds lower body 
temperature, but chlorpromazine depresses operant 
behavior regardless of whether it is heat-producing or
heat-reducing, quinine causes a compensatory change
in behavior, and sodium salicylate elicits parallel be-
havioral and physiological responses.

Fig. 12. Cumulative record showing response rate for heat and
food of one rat before and after injection of quinine HCl (50
mg/kg) and saline. The pen resets to base line every 15 min.

Of these, the results with sodium salicylate are
potentially the most interesting from our point of
view. They imply that this drug is shifting the thermo-
regulatory set point downward. One must be extremely
cautious in making this interpretation because the
increase in responding may represent a general activa- 
tion. If salicylate is in fact lowering the set point, not 
only should responding to escape heat increase, but 
responding for heat should decrease. 


Neurochemical Basis of Thermoregulation 


In recent years one of the major efforts in the study 
of temperature regulation has been to identify the 
neurotransmitters released at the synapses of neurons 
in thermoregulatory pathways. The protocol in many 
of these studies is simple: inject the transmitter and 
measure pre- and postinjection body temperatures. 
From these sorts of data pharmacological models of 
heat loss and heat production are constructed. 

Not surprisingly, pharmacological models abound. 
The data on which they are based vary with the 
species tested, the dose of drug used, and the route of 
injection. We shall first list some of the conflicting 
results in this field to give an idea of the magnitude 
of the problem, and then we shall demonstrate why 
operant measures can provide the right sorts of data
on which to build a pharmacological model.

In the first experiments studying neurotransmitters 
and body temperature, Feldberg and Myers (1963) re-
ported that intraventricular injections of norepineph-
rine (NE) abolished shivering and lowered pyrogen-
induced fever in cats. Serotonin (5-HT), on the other
hand, caused shivering and a rise in rectal tempera-
ture in afebrile cats. When smaller doses were injected
into the preoptic/anterior hypothalamus, the results 
were similar. Injections into other areas of the brain 
did not affect body temperature. On the basis of these 
results, Feldberg and Myers suggested that NE activates
heat loss effectors (sweating, panting, vasodilation) and
5-HT activates heat-production pathways (shivering).

This was the first neurochemical model of tempera-
ture regulation. Similar results with dogs (Feldberg,
Hellon, & Lotti, 1967) and monkeys (Myers & Yaksh, 
1969) led to the hope that this model might be 
applicable to all mammalian species. However, differ- 
ent results soon appeared in other species. With intra- 
ventricular administration, rabbits became hyperther- 
mic after NE and hypothermic after 5-HT (Cooper, 
Cranston, & Honour, 1965; Jacob & Peindaries, 1973). 
Mice responded to both of these amines with a drop 
in body temperature (Brittain & Handley, 1967; 
Handley & Spencer, 1972). Rats became hypothermic 
after 5-HT (Feldberg & Lotti, 1967; Myers & Yaksh, 
1968), but it was found that NE can lead to either a 
fall in temperature (Bruinvels, 1970, 1973; Satinoff & 
Cantor, 1975), a rise (Myers & Yaksh, 1963), or a fall 
followed by a rise (Feldberg & Lotti, 1967). To com-
pound the problem, intrahypothalamic and intra-
ventricular injections sometimes lead to opposite
effects. Furthermore, acetylcholine (ACh) has potent 
thermoactive properties which vary with the species 
(Brimblecombe, 1973). Histamine also affects body 
temperature (Brezenoff & Lomax, 1970), as does dopa- 
mine (Hansen & Whishaw, 1973; Yehuda & Wurtman, 
1972). As things stand now, no one can predict how a 
particular transmitter will affect thermoregulation in 
an untested species. 

Most of the experiments on transmitter substances, 
temperature regulation, and instrumental behavior 
have been done with rats. We shall use this work to 
illustrate how operant measures can clear up confu- 
sion about pharmacological mechanisms. Our analysis 
rests on the assertions made on page 165: when set 
point is displaced, animals will work to bring actual 
body temperature as close as possible to the new level; 
when body temperature is altered without a set point 
change, behavior will be compensatory. 


NOREPINEPHRINE 


In a neutral environment (25°C) preoptic in-
jection of low doses of NE (0.05 to 0.15 μg) raised
hypothalamic temperature. The same effect occurred 
in the cold, and the rats increased lever press- 
ing for heat. Both of these effects were monotonic 
functions of the concentration of NE (Beckman, 1970). 
The operant augmented the rise in internal tempera- 
ture, which suggests that the set point had been raised. 
(However, before this conclusion can be reached it 
must be shown that the increased responding for heat 
is a specific thermoregulatory effect and not just a 
general increase in activity.) If low doses of NE in the
preoptic area shift the thermoregulatory set point up- 
ward, then we would expect that all reflexive re- 
sponses would be integrated so as to increase heat 
production and decrease heat loss. In fact, this is 
exactly what happens. The rise in temperature is 
caused by increased shivering, increased metabolic 
rate, and vasoconstriction (Satinoff & Hackett, 1975).

Avery (1971; see also Avery & Penn, 1973) has re- 
ported quite different results. In room air (23°C), 
preoptic injection of high doses of NE (25 μg) lowered
body temperature. Heat escape was the behavioral 
measure: as long as the bar was held down, the heat 
lamp was off and a cooling fan on. The rats held the 
bar down less postinjection; that is, they allowed the 
chamber to stay hot. The body temperature change
and the operant behavior moved in opposite direc-
tions: body temperature dropped, yet the rats de-
creased responding to escape heat. In this case behavior
compensated for the lowering of internal temperature. 
The drug effect is therefore not a change in the set 
point.6 We would not expect the fall in body tempera-
ture to be the result of an integrated thermoregula-
tory process. Although there were no measures of 
reflexive responses in these experiments, Satinoff and 
Cantor (1975) reported similar results after intraven- 
tricular injections of NE: body temperature fell and 
the rats compensated in the cold by increasing lever 
pressing for radiant heat. This was not a general ac- 
tivating effect of the NE, because when it was injected 
in a warm environment and body temperature fell, 
the rats did not increase responding to escape heat. 
When these authors later examined how the fall was 
brought about, they found that at normal room tem- 
perature it was caused by an immediate and intense 
peripheral vasodilation. Metabolic rate actually in- 
creased quickly thereafter as if to compensate for the 
fall (Cantor & Satinoff, 1976). Wherever intraventricu- 
lar NE is acting, it is not shifting the set point. 

We can tentatively conclude that, in rats, NE in- 
jected in the preoptic area shifts the set point upward 
by activating both operant and reflexive mechanisms 
leading to augmented heat production and decreased 
heat loss. Any hypothermic effects of NE, either after 
intraventricular injection or after injection of agents 
such as 6-hydroxydopamine, which causes release of 
endogenous NE (Breese, Moore, & Howard, 1972; 
Hansen & Whishaw, 1973; Nakamura & Thoenen, 
1971; Simmonds & Uretsky, 1970), are apparently 
caused by action either on a controller that activates 
effector pathways or on the effector pathways them- 
selves. We predict that the hypothermia seen after 
6-hydroxydopamine injections will be accompanied by 
increases in heat-reinforced behavior. 


ACETYLCHOLINE 


The same reasoning can be applied in interpreting 
the effects of other transmitter substances implicated 
in thermoregulation. ACh and other cholinomimetic 
agents lowered body temperature in rats when in- 
jected intrahypothalamically (Beckman & Carlisle, 
1969; Crawshaw, 1973; Kirkpatrick & Lomax, 1970). 


6 Similar considerations apply to the analysis of the effects of 
brain stimulation on body temperature. Electrical stimulation of 
the preoptic area while the rats were working for heat in the 
cold produced changes in body temperature but did not cause 
appropriate shifts in operant behavior (Crawshaw & Carlisle, 
1974). Thus although electrical stimulation produced changes in 
body temperature, it did not affect motivational aspects of 
thermoregulation, and therefore such changes in body tempera- 
ture need not be interpreted as changes in thermal set point. 
ACh (50 µg) caused both a fall in brain temperature
and a decrease in rate of working for heat in the cold 
(Beckman & Carlisle, 1969). Here the behavior and the 
physiological responses are complementary, and if it 
has no general depressant effect on operant respond- 
ing, we can assume that ACh lowers the set point. 

On the other hand, carbachol (a cholinomimetic
drug; 8 µg) increased body temperature and also
increased bar holding to escape heat (Avery & Penn, 
1973). In this case the behavior compensated for the 
change in body temperature, and therefore this dose 
of carbachol did not change the set point. Like high 
doses of NE, it might have acted directly on effector 
pathways (in this case to promote heat loss) or blocked 
the appropriate synapses linking the central controller 
to those pathways. This latter possibility is likely because
atropine, a cholinergic blocking agent, led to a
rise in body temperature in rats when injected into 
the preoptic area (Kirkpatrick & Lomax, 1967). 
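
The inference applied to these drug studies can be restated as a simple decision rule. The following is only a minimal Python sketch of that reasoning, not a procedure taken from the cited experiments; the function name and the sign coding (+1, 0, -1) are assumptions introduced for illustration. Heat seeking is coded as positive when the animal works more for heat or escapes heat less.

```python
def interpret_drug_effect(temp_change: int, heat_seeking_change: int) -> str:
    """Classify a drug effect from the signs of two observed changes.

    temp_change:         +1 body temperature rose, -1 fell, 0 unchanged
    heat_seeking_change: +1 more heat-seeking behavior (more work for heat,
                         or less escape from heat), -1 less, 0 unchanged
    The coding is hypothetical; it only restates the reasoning in the text.
    """
    if temp_change == 0:
        return "no temperature change to interpret"
    if heat_seeking_change == 0:
        return "no behavioral evidence of a set-point shift"
    if heat_seeking_change == temp_change:
        # Behavior and physiology push temperature in the same direction.
        direction = "raised" if temp_change > 0 else "lowered"
        return f"complementary: consistent with a {direction} set point"
    # Behavior opposes the temperature change.
    return "compensatory: set point apparently unchanged"


# ACh: temperature fell and working for heat decreased.
print(interpret_drug_effect(-1, -1))
# Carbachol: temperature rose and escape from heat increased.
print(interpret_drug_effect(+1, -1))
```

The "complementary" outcome supports a set-point interpretation only if the drug has no general depressant or activating effect on operant responding, a condition the sketch does not check.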


SEROTONIN 


This indoleamine lowered body temperature in rats 
when injected intraventricularly (Bruinvels, 1970; 
Feldberg & Lotti, 1967; Myers & Yaksh, 1968). As with 
NE, intrahypothalamic injections of 5-HT had the 
opposite effect, but produced little change in the rate 
of bar pressing for heat in the cold (Crawshaw, 1972). 
Crawshaw concluded that 5-HT does not act in the 
preoptic area to shift the set point, but rather that its 
effect on body temperature is unspecific. Bruinvels 
(1970), on the basis of pharmacological data, also con- 
cluded that 5-HT does not act on thermosensitive 
receptors, but instead its hypothermic effect is caused 
by some unspecific action. 

From all of these experiments we can tentatively 
conclude that in rats, NE raises the set point, ACh 
lowers it, and 5-HT does not affect it. The number of 
experiments investigating the action of transmitter 
substances on both body temperature and operant 
behavior is still small, and even when such experi- 
ments become more numerous everything will not 
magically fall into place. Some of the species differ- 
ences may indeed be real. However, many of the con- 
flicting results about the action of transmitter sub- 
stances on body temperature could be cleared up if 
operant measures were used more often to specify 
whether the set point was being altered. It is fruitless 
to build neurochemical models of thermoregulation 
without knowing if a compound is really changing the 
set point. This can best be done by a careful analysis 
not only of physiological responses, but of operant 
responding as well. 
Determinants of Reinforcement and Punishment*


REPRODUCIBLE BEHAVIORAL PROCESSES 


The scientific study of behavior poses many diffi- 
culties. One difficulty results, paradoxically, from our 
familiarity with numerous isolated facts about the be- 
havior of ourselves, other people, and animals. The 
interpretations customarily given to these facts lead 
to preconceived opinions, which frequently interfere 
with the unbiased study of behavior. Moreover, be- 
havior is essentially dynamic in the sense that behav- 
ioral processes reflect changes in the interactions be- 
tween an individual and his environment which take 
place in time. Even the simplest relationships may not 
be readily apparent to casual observations at any 
moment. Finally, a pattern of behavior is the result of 
many interrelated factors, including environmental 
circumstances that have long since ceased to exist, 
thus posing special problems for identification and 
study. 

in preparation of the manuscript and Drs. P. B. Dews, J. W.

In the early 1930s B. F. Skinner developed techniques
for the experimental study of behavior. An essential
feature in his approach to behavior was the
emphasis on the rate of occurrence of some identifiable
“response” as a significant property of behavior.
Techniques for studying reflexly elicited behavior had 
already been developed, but it is not possible to 
identify an eliciting stimulus for much of the be- 
havior of an individual that can be predicted and 
controlled. To say that behavior occurs in the absence
of an identifiable eliciting stimulus does not imply 
that the behavior is not determined, but simply that 
it does not have the functional properties of reflexly 
elicited behavior. For example, a food-deprived rat 
given access to a supply of small food pellets will eat 
for a period of time and then cease. If the rate of in- 
gesting pellets is recorded, a simple and reproducible 
curve of eating is obtained that describes the ingestion 
of food under these conditions (Skinner, 1930, 1938). 
Although the rate of eating following food depriva- 
tion is a reproducible temporal process, it is not pos- 
sible to analyze this behavior simply in terms of 
momentary eliciting stimuli. Because the presentation 
of food is often immediately followed by ingestion, it 
may seem that the ingestion of food is elicited by the 
presence of food itself (presumably its sight and smell).
But the rate of eating declines in time; the presence of
food does not continue to have the same preemptiveness.
Thus it is necessary to invoke some other factors,
such as habituation, fatigue, adaptation, or deprivation,
operating in conjunction with the sight and smell
of food. The deprivation of food critically determines
rate of eating and also changes other classes of re- 
sponses in a reproducible way, but it is not an elicit- 
ing stimulus in the sense in which the term is used in 
reflex physiology. 

The occurrence of emitted behavior generally bears 
a temporal relation to the deprivation and presenta- 
tion of particular environmental conditions, whether 
or not the deprivation produces any conspicuous 
physiological change. For example, if a rat is confined 
in a small space and then given access to a revolving 
wheel, it will run for some time and then gradually 
cease. As in the example of the rat eating, a record in 
time of the running behavior will reveal that the 
running has a characteristic temporal pattern. After 
the rat is deprived of access to the wheel, the avail- 
ability of the wheel is closely followed by running, 
yet it is not useful to regard the wheel as an eliciting 
stimulus for running. In studying the occurrence of 
such behavior in time, it is clearly desirable that par- 
ticular instances be easily identified, reproducible, 
and functionally significant. The criterion for specify- 
ing functionally significant emitted responses as op- 
erants will be taken up later in this section. 

Behavioral phenomena that have an identifiable 
temporal pattern under specified conditions and which 
are reproducible in different individuals may be de- 
scribed as reproducible behavioral processes (Zimmer- 
man, 1963). An understanding of such reproducible 
behavioral processes is to be found in the exact characterization
of the temporal relations among the events
comprising such processes and in the specification of 
the conditions under which they occur. The present 
chapter will discuss reinforcement and punishment in 
the context of reproducible behavioral processes. In 
many respects, reinforcement and punishment are 
analogous if not equivalent processes; therefore, con- 
siderations pertaining to the one will usually apply to 
the other. 

Future behavior is mainly determined by the conse- 
quences of past behavior. How behavior is changed 
by experience can be demonstrated as a reproducible 
behavioral process with a food-deprived rat in an 
apparatus containing a food dispenser and a lever 
projecting from the wall. Any response of the rat that 
depresses the lever will be followed by the presenta- 
tion and ingestion of food. Under such circumstances, 
the likelihood that a similar response will occur again
after the food is eaten is increased, and further re- 
sponses will occur with a characteristic temporal pat- 
terning. If behavior is so altered by the presentation 
of food, the conditions under which food presentation 
occurs—in this example, the depression of the lever— 
define the property with respect to which responses 
are called similar. Skinner (1937; 1938; 1953, p. 66)
uses the term operant to describe this functionally 
identifiable class and calls the change in the frequency 
of the operant the process of operant conditioning. 
Because subsequent behavior is altered under these 
conditions, the food is said to be a reinforcer and pre- 
senting food in a specified relation to an operant is 
reinforcement. 

If only depressions of the lever exceeding a certain 
force are followed by food presentation (differen- 
tial reinforcement), weaker responses diminish and 
stronger responses become more frequent. Even 
stronger responses can be selected through further 
progressive differential reinforcement. It should be 
noted that merely specifying relations between re- 
sponses and consequent stimuli may not specify a 
functional class of responses that could be called operant.
“No property is a valid defining property of a
class until its experimental reality has been demonstrated,
and this rule excludes a great many terms
commonly brought into the description of behavior” 
(Skinner, 1938, p. 41; 1969). Yet it is important to 
recognize that these broad principles do apply beyond 
experimental situations that can be precisely de- 
scribed. In his recent writings, Skinner has used the 
term contingencies of reinforcement to refer to the
interrelations between antecedent behavior and con- 
sequent events that define operants (Skinner, 1969, 
pp. 7, 127). 

The examples above are important because basic
concepts applicable to the formulation of behavior as
a scientific system were developed in this situation. It
is clear from Skinner's experimental reports that the
basic data in the example of “conditioning” and in
the earlier one of “changes in hunger” (food deprivation)
were the orderly changes in rate of responding.


“Conditioning” . . . and “a change in hunger” 
differ as processes only with respect to the con- 
ditions under which they are observed. The 
thing changing (the observed aspect of behavior) 
is the same in both. (Skinner, 1932, p. 276) 


1 Skinner uses reinforcement to refer to the specifiable con- 
ditions in the environment that give rise to operant conditioning. 
In this chapter the single term reinforcement will be used to 
refer to both the orderly change in behavior and its environ- 
mental determinants. 
The change in behavior under the specified conditions
that identifies the food as a reinforcer and identifies
lever pressing as an operant is a reproducible
behavioral process.

A distinction has been made between operations as 
experimental procedures that are imposed by the 
environment and processes as the behavioral effects of 
these procedures (Catania, 1973, p. 33; Ferster & Skinner,
1957, p. 730). Although reinforcement is often
described as a relation or operation (the presentation 
of a reinforcer in a specified temporal relation to an 
operant), it is clear that the operation of reinforce- 
ment (or punishment) has a behavioral effect implicit 
in its meaning. Behavioral processes are best viewed 
as orderly changes in time and need not imply inter- 
vening mechanistic principles. The terms reinforce- 
ment and punishment are used here to refer to the 
reproducible changes in behavior resulting from the 
experience of the individual under certain specified 
conditions. The connotations of these terms include 
both a temporal sequence of behavior and the condi- 
tions under which this behavior occurs. 

Since the time of the first reports by Skinner, many 
orderly changes in rates of responding under other 
specified conditions have been described (see especially 
Ferster & Skinner, 1957; Skinner, 1938). Among the 
most important specifications are the schedules de- 
scribing the arrangements for initiating and termi- 
nating stimuli in time and in relation to specified 
responses. Such schedules engender changes in behavior
with characteristic temporal properties and
rates of responding that are consistent in different 
individuals. Hence it is appropriate to view schedule- 
controlled performances as reproducible behavioral 
processes.2 This conception differs from the usual approach
of analyzing schedule performances as special
consequences of reinforcement (or punishment). 
While the latter approach has the advantage of limit- 
ing the number of basic behavioral processes to two 
fundamental cases, some schedule-controlled patterns 
are more sensitive dependent variables in revealing 
how behavior is modified by environmental conditions 
than the changes in level of responding that are 
usually used to define reinforcement or punishment. 
In particular, the effects of consequent stimuli can 
be greatly changed, depending upon how they are 
scheduled (see this chapter’s sections on disparate 


2 The characterization of different schedule performances as 
behavioral processes and the specifications of the conditions 
under which these performances occur are beyond the scope of 
this chapter (see Ferster & Skinner, 1957; Morse, 1966; Skinner, 
1966; Zeiler, chapter 8 in this volume). The present chapter 
emphasizes the importance of basing behavioral concepts on 
orderly reproducible changes in behavior. 
effects of consequent events and response-produced
shocks as consequent events maintaining behavior). 
Thus the status of consequent events defined as “rein- 
forcers” or “punishers” in one context may be changed 
when they are scheduled differently. This raises im- 
portant questions about the fundamental concepts 
applicable to a scientific formulation of behavior and 
about the generality of the concepts of “reinforcers” 
and “punishers.” The defining characteristics of rein- 
forcers and punishers do not encompass all the effects 
of such stimuli on behavior. How consequent events 
modify behavior is to be understood in both the 
development and the maintenance of subsequent be- 
havior. 

The description of reinforcement and punishment 
as reproducible behavioral processes differs from the 
usual description of these terms as operations (see 
especially Catania, 1968, 1969). A discussion of the 
differences is instructive in clarifying the precise 
usage of terms. In common usage the terms reinforcer 
and punisher are emphasized as basic terms, while 
reinforcement and punishment are defined as the 
presentation of a reinforcer or punisher in a specified 
temporal relation to an operant. The increased occur- 
rence of responses similar to one that immediately 
preceded some event identifies that event as a rein- 
forcer. A punisher is defined in an analogous way: the 
decreased occurrence of responses similar to one that 
immediately preceded some event identifies that event 
as a punisher. Reinforcers and punishers, as environ- 
mental “things,” appear to have a greater reality than 
orderly temporal changes in ongoing behavior. Such 
a view is deceptive. There is no concept that pre- 
dicts reliably when events will be reinforcers or 
punishers; the defining characteristics of reinforcers 
and punishers are how they change behavior. Events 
that increase or decrease the subsequent occurrence 
of one response may not modify other responses in the 
same way. The modification of behavior by a rein- 
forcer or by a punisher depends not only upon the 
occurrence of a certain kind of consequent environ- 
mental event but also upon the qualitative and quan- 
titative properties of the ongoing behavior preceding 
the event and upon the schedule under which the 
event is presented. 

In characterizing reinforcement as the presentation 
of a reinforcer contingent upon a response, the ten- 
dency is to emphasize the event and to ignore the im- 
portance of both the contingent relations and the 
antecedent and subsequent behavior. It is how they 
change behavior that defines the terms reinforcer and 
punisher; thus it is the orderly change in behavior 
that is the key to these definitions. It is not appropriate
to presume that particular environmental
events such as the presentation of food or electric 
shock are reinforcers or punishers until a change in 
rate of responding has occurred when the event is 
scheduled in relation to specified responses. There is 
little value in naming only one of the conditions 
necessary for the change. Identifying an event as a 
reinforcer or punisher independently of the conditions 
of use has limited predictive utility. On the other 
hand, prior identification of the suitable conditions 
that will result in the same behavioral process does 
provide generality. The purpose of the present chapter
is to give perspective to basic concepts used in the
experimental study of behavior by discussing some of 
the determinants of reinforcement and punishment. 


THE CONTINUITY OF BEHAVIOR IN
TIME (SHAPING) 


Because reinforcement and punishment are be- 
havioral processes occurring in time, they can only be 
understood in the temporal context of sequential 
interactions of behavior with the environment.3 Emphasizing
“reinforcers” and “punishers” as primary
events neglects the importance of both antecedent and
subsequent behavior. Operant behavior is determined
mainly by the consequences of past behavior—not so 
much the particular consequences, but their sequence 
in time and in relation to the individual’s behavior. 
The scheduling of events is critically important. The 
outstanding characteristic of operant behavior is that 
it can be differentiated in form and in temporal pat- 
terning by consequent events. Conditioned operant 
behavior emerges from existing behavior through suc- 
cessive approximations to new and more complex 
forms of behavior by the process of successive differen- 
tial reinforcement (shaping). Behavior that has be- 
come highly differentiated can be understood and ac- 
counted for only in terms of the history under which 
the behavior was shaped by different consequences. 
This schedule gives the exact historical specification 
of the temporal and sequential relations between 
environmental events and behavior. 

One purpose of this chapter is to emphasize how 
present and future behavior depends upon the sequen- 
tial ordering of behavior. Because it is generally 
understood that behavior can be shaped by successive 


3 Perhaps shaping the suppression of behavior with punish- 
ment is not exactly analogous to shaping with reinforcement; 
little is known about the former. The work of Azrin (1960) 
clearly indicates that punishment depends upon sequential re- 
lations between behavior and the environment. 
differential reinforcement, it is useful to consider the
shaping of responses commonly studied in experi- 
mental situations. Knowing the appropriate condi- 
tions for using a reinforcer, one can give a general 
specification for shaping operant behavior with it: 
select a response with vector properties, follow the oc- 
currence of a particular magnitude of this response 
one or more times with the reinforcer, then withhold 
it until the response magnitude exceeds the value 
previously reinforced, and reinforce this greater mag- 
nitude. Thus by making the presentation of rein- 
forcers intermittent and dependent upon some pro- 
gressively changing property of behavior, one can 
shape behavior toward some ultimate specification 
through successive approximations. By the shaping of 
behavior, one can develop new forms of behavior that 
could not exist without an explicit history of differ- 
ential reinforcement. Nevertheless, important aspects 
of the shaping process are still unknown. Different 
responses vary in stereotypy, rate of occurrence, dis- 
creteness of identification, and the extent to which 
they are changed by consequent events. A consequent 
event can be more easily used to increase the fre- 
quency of occurrence of an operant with a low initial 
frequency than one with a high initial frequency. Be- 
cause the presentation of a reinforcer tends to enhance 
behavior, it is easier to shape a response involving 
some discrete activity than a response involving sustained
immobility. In fact, it may be difficult to shape
an operant involving little or no movement, such as
“holding” or “standing still” (Blough, 1958). Yet
operants come under the schedule control of consequent
events even when their average rate is refractory
to change (Skinner & Morse, 1958). 
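
The general specification for shaping given above (reinforce a magnitude, then withhold the reinforcer until a greater magnitude occurs) can be illustrated with a small simulation. This is a hedged sketch only: the Gaussian variability of responses, the `learning_rate` parameter, and the rule that reinforcement moves the typical magnitude toward the reinforced value are all assumptions made for illustration and are not claims from the text.

```python
import random


def shape_response(start_mean, target, spread=1.0, learning_rate=0.5,
                   max_trials=2000, seed=0):
    """Simulate shaping by successive differential reinforcement.

    Returns (criteria, mean): the successive criterion values set by each
    reinforced response, and the final typical response magnitude.
    """
    rng = random.Random(seed)      # fixed seed so a run is repeatable
    mean = start_mean              # current typical response magnitude
    criterion = start_mean         # magnitude that must be exceeded
    criteria = []
    for _ in range(max_trials):
        magnitude = rng.gauss(mean, spread)   # one emitted response
        if magnitude <= criterion:
            continue                          # reinforcer withheld
        # Reinforcer presented. ASSUMPTION: reinforcement shifts the
        # typical magnitude toward the magnitude just reinforced.
        mean += learning_rate * (magnitude - mean)
        # The reinforced value becomes the new criterion (capped at target).
        criterion = min(magnitude, target)
        criteria.append(criterion)
        if criterion >= target:
            break
    return criteria, mean


criteria, final_mean = shape_response(start_mean=0.0, target=10.0)
print(len(criteria), "reinforced responses; final mean:", round(final_mean, 2))
```

Because each reinforced response must exceed the previously reinforced value, the sequence of criteria can only rise, which is the sense in which the final form depends on the whole history of approximations.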

The importance of the shaping sequence is recognized
in cases where the final form of behavior did not
occur initially, which is the situation usually described 
to illustrate the principle of shaping. Under such 
circumstances it is clear that transitional behavior is
essential in developing the final behavior. It is less 
often recognized that an individual's past experience— 
how his behavior has been shaped—is usually a de- 
terminant of his subsequent behavior. Even in dealing 
with repetitive responses such as a rat pressing a lever 
or a pigeon pecking at a disc, the quantitative effect 
of presenting a particular event after an instance of 
such a response depends on the subject’s history. For 
example, when food is presented after every 100 re- 
sponses under a fixed-ratio schedule, responding may 
be well maintained in a subject with a history of 
responding under this or other schedules but not in a 
subject without an appropriate history. The effective- 
ness of an event in maintaining a sequential pattern 
of responding depends on the ongoing pattern of responding
itself, which in turn depends on the subject’s
experimental history. These topics will be taken up 
in following sections. 
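
The fixed-ratio contingency mentioned in the example above can be written down in a few lines. The class below is an illustrative sketch with hypothetical names; it captures only the programmed relation between responses and the consequent event, and says nothing about whether responding will actually be maintained, which, as noted, depends on the subject's history.

```python
class FixedRatio:
    """Present the consequent event after every `ratio`-th response."""

    def __init__(self, ratio: int):
        if ratio < 1:
            raise ValueError("ratio must be at least 1")
        self.ratio = ratio
        self.count = 0          # responses since the last presentation

    def respond(self) -> bool:
        """Register one response; return True if it produces the event."""
        self.count += 1
        if self.count == self.ratio:
            self.count = 0      # the ratio requirement starts over
            return True
        return False


schedule = FixedRatio(100)
presentations = sum(schedule.respond() for _ in range(250))
print(presentations)
```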


DISPARATE EFFECTS OF 
CONSEQUENT EVENTS 


A remarkable diversity exists in the physical char- 
acteristics of events that can reinforce behavior. In- 
cluded among these events are: food, water, sex, elec- 
tric shock, and intracranial stimulation; changes in 
lights, sounds, temperature, and gravity; opportunity
to explore, run, groom, lick a stream of air, play, or
fight; and the injection of various drugs. It has been
assumed wrongly that the reinforcing or punishing 
effect of an event is a consistent property of the event 
itself; the presentation of food after a response has 
been considered an inherently positive event that will 
enhance subsequent responding, while the presenta- 
tion of electric shock after a response has been con- 
sidered an inherently negative event that will suppress 
subsequent responding. That food presentation may 
not affect responding in an animal that has not been 
deprived of food is usually considered a quantitative
variation in the effect of food presentation rather than 
evidence against food presentation having inherent 
properties as a reinforcer. But the effects of reinforc- 
ing events are not invariant. Even under a given 
degree of deprivation, the presentation of food to an 
individual may not have a consistent reinforcing 
quality. One may have an aversion to a particular 
food as a child, be indifferent to it as an adolescent,
and eat it readily as an adult. It is less widely recognized
that under appropriate conditions the suppressive
effects of electric shock presentation can be reduced
or even converted to an enhancing effect. Such
disparate effects of consequent events are most likely 
to occur when there is a history of schedule-controlled 
responding and when there are multiple determinants 
of behavior. 

In common practice experimenters normally use 
consequent events that do reliably modify the re- 
sponse classes that are being studied and do have 
generality from individual to individual. In most 
experiments in which food is used as the reinforcing 
consequence, the subject is initially deprived severely 
(65 to 80% of free-feeding body weight) and its re- 
sponse to the presentation of food preempts other 
activity. The presentation of food is made contingent 
upon a simple response, such as pressing a key for a 
rat or monkey, that occurs infrequently before it is 
followed by food presentation. Under these conditions
the subject’s behavior changes in a predictable way. 
The development of standardized equipment and the 
use of standardized procedures has increased the like- 
lihood that any experimenter can reproduce such re- 
sults. Unfortunately, this success in engendering be- 
havior under what are actually special conditions has 
led to uncritical beliefs about reinforcers and 
punishers. 

Of particular significance are instances in which the 
same event can under different conditions reliably 
produce opposite effects on behavior. With events, 
such as presentation of food or water, that charac- 
teristically lead to further behavior on the part of the 
individual, such changes in the direction of effect are 
easily missed. If an individual fails to consume food 
or water presented under the usual conditions, for 
example, the possibly suppressing effects of these 
events may not be apparent. Disparate effects of the 
same event are most likely to be observed when the
presentation of the event directly affects the indi- 
vidual. Such changes in the reinforcing (or punishing) 
effectiveness of intracranial stimulation, electric shock, 
and drug injections will be discussed below. 

Opposite effects of intracranial stimulation on 
lever-pressing responses in the rat have been shown by 
Steiner, Beer, and Shaffer (1969). In the initial phase 
of their study, each lever-pressing response on one of 
two levers (lever S) resulted in electrical stimulation
of an area of the hypothalamus; at appropriate 
stimulus parameters, rapid and reliable responding 
was engendered and maintained on lever S. The pat- 
terns of responding and intracranial stimulation were 
tape-recorded. In subsequent phases of the study, 
intracranial stimulations of the same intensity were 
presented to each rat according to the pattern previ- 
ously recorded for that rat. Responses on lever S were 
recorded but had no programmed consequences, 
whereas responses on lever E postponed the scheduling
of further intracranial stimulations for 20 sec. Under 
these conditions, the rates of responding on lever S 
decreased to near zero, while rates of responding on 
lever E increased and were stably maintained in subsequent
sessions. These results show that depending
upon the circumstances, responding could be main- 
tained by either the presentation or the postponement 
of the same intracranial stimulation. 

Opposite effects of presenting electric shock under 
different schedules have been studied by Kelleher and 
Morse (1968a). Squirrel monkeys were trained initially 
under a variable-interval (VI) schedule of food presen- 
tation that maintained a steady rate of responding. 
Then a 10-min fixed-interval (FI) schedule of electric 
Fig. 1. Alternate periods of maintenance and suppression of responding by different
schedules of electric shock presentation in a squirrel monkey. Ordinate—cumulative 
number of responses; abscissa—time. Electric shock presentations (12.6 mA) are marked 
by short diagonal strokes on the cumulative record and the event record. The recording 
pen reset to the base line at the end of each 11-min cycle. The paper did not move during 
the 1-min time-out period at the end of each cycle. During the first 10 min of each cycle, 
positively accelerated responding, characteristic of performance under fixed-interval (FI) 
schedules, was maintained; during the last min of each cycle, in which each response 
produced an electric shock, responding was suppressed. (From Morse & Kelleher, 1970.) 


shock presentation was superimposed upon the sched- 
ule of food presentation. When the schedule of food 
presentation was eliminated, responding characteristic 
of FI schedules could be maintained by the schedule 
of electric shock alone. In one experiment, when an 
electric shock was produced by each response during 
the last minute of an 11-min cycle ending with a time- 
out period, responding was positively accelerated dur- 
ing the first 10 min (FI schedule) but suppressed dur- 
ing the last minute of each cycle (Figure 1). Thus 
electric shocks of the same intensity that maintained 
responding under the FI schedule suppressed respond- 
ing during the part of the cycle in which each response
produced electric shock.
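
The structure of the cycle in this experiment can be laid out schematically. The sketch below is an illustration under stated assumptions, not the authors' programming of the apparatus: it assumes that the 1-min time-out follows the 11-min cycle, and that every response in the eleventh minute, including the first one after the 10-min interval has elapsed, produces a shock. All names are hypothetical.

```python
FI_SECONDS = 10 * 60     # fixed-interval portion of each cycle
SHOCK_SECONDS = 60       # last minute: each response produces a shock
TIMEOUT_SECONDS = 60     # ASSUMPTION: time-out follows the 11-min cycle
PERIOD = FI_SECONDS + SHOCK_SECONDS + TIMEOUT_SECONDS


def contingency_at(t: float) -> str:
    """Name the programmed contingency t seconds after session start."""
    if t < 0:
        raise ValueError("t must be non-negative")
    t = t % PERIOD                      # position within the current cycle
    if t < FI_SECONDS:
        return "fixed interval"         # responses here produce no shock
    if t < FI_SECONDS + SHOCK_SECONDS:
        return "shock per response"
    return "time-out"


def shock_producing(response_times):
    """Return the response times (in seconds) that would produce a shock."""
    return [t for t in response_times
            if contingency_at(t) == "shock per response"]


print(shock_producing([30, 590, 601, 640, 700]))
```

The point of the layout is that the same shock, at the same intensity, is programmed under two different contingencies within one cycle, so any difference in responding between the two portions reflects the schedule rather than the event.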

Another study suggests that intravenous injections 
of nalorphine, a drug that antagonizes the actions of 
morphine, can function in seemingly opposite ways in 
rhesus monkeys (Goldberg, Hoffmeister, Schlichting, 
& Wuttke, 1971). The administration of nalorphine to 
a morphine-dependent monkey precipitates an imme- 
diate and severe withdrawal syndrome. In one phase of 
the study, morphine-dependent monkeys were trained 
under a schedule in which key pressing produced 
intravenous injections of morphine. After stable per- 
formance had developed, injections of either saline or 
nalorphine were substituted for morphine. Although 
response-produced nalorphine injections did precipi- 
tate a severe withdrawal syndrome, response rates 
were higher than those maintained by morphine or 
saline. In a subsequent phase of this study with mor- 
phine-dependent monkeys, intravenous injections of 
nalorphine were automatically delivered in the pres- 
ence of a stimulus; responding terminated the stim- 
ulus and the associated injections. Under this schedule 
of stimulus-injection termination, responding was
well maintained with nalorphine injections but not
with saline injections. These results with intravenous
injections of nalorphine, like those obtained with
intracranial stimulation and with electric shocks, indicate
that factors such as the controlling schedule can
determine the effect of an event on behavior. Interestingly,
the experiments on electric shock are more
puzzling to many people than those on intracranial
stimulation or drug injections. This undoubtedly
occurs because the latter two situations are relatively
unfamiliar, suggesting that tacit commonsense notions
pervade scientific thinking more than is usually
realized.


With quantitative variations in the magnitudes of
consequent events, the possibilities of varied effects
are increased. Simply altering some parameter of a
consequent event can completely change the subsequent
frequency of occurrence of responses which
produce the event.4 Responses that produce intracranial
stimulation, for example, will usually increase in
frequency as the intensity or duration of stimulation
is increased over some range of values; however, at
higher values responding will decrease and eventually
cease. When responding is maintained by response-


4 When suppression of responding occurs only at certain 
parameter values of a consequent event, the suppression may be 
lasting or transitory. In most instances in which responding is 
suppressed by intense electric shock, the effect has been shown to 
be lasting and is appropriately described as operant punishment. 
Responding will also be suppressed just after it has resulted in 
the presentation of a large amount of food or the injection of a 
high dose of drug. Yet under certain circumstances, it may be 
possible to show that rate of responding generally increases as 
the amount of food or dose of drug increases even if responding 
just after the event is suppressed temporarily. (For example, see 
the description below of the experiment by Hawkins & Pliskoff, 
1964.) 


180 


produced electric shocks, similar increases and de- 
creases in rate of responding would be expected with 
increases in intensity or duration of electric shock. In 
some circumstances, responding is not initiated even 
under conditions in which it has previously been well 
maintained by an event. With intracranial stimula- 
tion, for example, responding may not occur unless 
each daily session begins with an automatically pre- 
sented stimulation. A similar phenomenon is de- 
scribed in the section headed ‘Characteristics of Re- 
sponses” under conditions in which electric shock 
both elicits and modulates responding. Such results 
further indicate the varied effects of environmental 
events in controling responding. 

When an event that occurs after a response in- 
creases the subsequent frequency of occurrence of that 
response, the presentation of the same event after a 
different response or according to a different schedule 
may not affect behavior in the same way. The condi- 
tions required for the suitability of various events in 
modifying behavior can differ markedly, but under 
suitable conditions even different events can function 
similarly. Thus it is imperative to study factors in
addition to the events themselves which are involved 
in the processes of operant reinforcement and punish- 
ment. 


ONGOING BEHAVIOR 


Both the qualitative and quantitative properties of 
ongoing behavior are important aspects of reinforce- 
ment and punishment. Emphasizing the sequential 
patterning of behavior as a determinant of subsequent 
behavior shifts the focus of the interaction between 
behavior and environmental events toward behavior 
itself. Historically the situation has always been just 
the opposite. A stimulus paired with a reinforcer is 
said to have become a conditioned reinforcer, but 
actually it is the behaving subject that has changed, 
not the stimulus. Similarly, the physical properties of 
a discriminative stimulus are the same before and 
after it controls behavior; it is the subject that has 
become discriminative, not the stimulus. It is, of 
course, useful shorthand to speak of conditioned rein- 
forcers or discriminative stimuli, just as it is conve- 
nient to speak about a reinforcer rather than speaking 
about an event that has followed an instance of a 
specific response and resulted in a subsequent increase 
in the occurrence of similar responses. The latter may 
be cumbersome, but it has the advantage of empirical 
referents. Because many different responses can be 
shaped by consequent events, and because a given 
consequent event is often effective in modifying the
behavior of different individuals, it becomes common 
practice to refer to reinforcers without specifying the 
behavior that is being modified. These common prac- 
tices have unfortunate consequences. They lead to the 
erroneous views that responses are arbitrary and that 
the reinforcing or punishing effect of an event is a 
specific property of the event itself. 

The commonly used contingency table describing 
relations between the presentation and withdrawal of 
stimuli and their behavioral effects (Skinner, 1953, 
pp. 73, 185; Rachlin, 1970, p. 79) provides an example 
of the tendency to categorize stimuli in terms of in- 
herent properties. When the borders of the table are 
designated in terms of stimulus classes (positive-nega- 
tive; pleasant-noxious) and experimental operations 
(stimulus presentation—stimulus withdrawal), the cells 
of the table are, by definition, varieties of reinforce- 
ment and punishment. One problem is that the 
processes indicated in the cells have already been as- 
sumed in categorizing stimuli as positive or negative; 
a second is that there is a tacit assumption that the 
presentation or withdrawal of a particular stimulus 
will have an invariant effect. These relations are 
clearer if empirical operations are used to designate 
the border conditions, as shown in Table 1. In this 
case the cells of the table are unambiguously related 
to the designated conditions; the top row indicates the 
process of reinforcement and the bottom row the 
process of punishment. If the presentation of a par- 
ticular stimulus increases behavior under one condi- 
tion and decreases behavior under another condition, 
there is no need for a category of paradoxical rein- 
forcement or punishment. In trying to understand 
why the same stimulus event can have different effects 
on behavior it is no help to consider reinforcement or 
punishment as paradoxical. The characterization of 
behavioral processes depends upon empirical observa- 
tions. The same stimulus event, under different condi- 
tions, may increase behavior or decrease behavior. In 
the former case the process is called reinforcement 
and in the latter the process is called punishment. 
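
The empirical definition in the paragraph above can be restated as a small sketch. It assumes only what the text says: the process is named from the observed change in responding, not from any property assigned to the stimulus in advance. The names `Process` and `classify_process`, the `tolerance` parameter, and the example rates are hypothetical and are not taken from Table 1 or from any cited study.

```python
from enum import Enum


class Process(Enum):
    REINFORCEMENT = "reinforcement"
    PUNISHMENT = "punishment"
    NO_EFFECT = "no effect"


def classify_process(baseline_rate: float, later_rate: float,
                     tolerance: float = 0.0) -> Process:
    """Name the process from the observed change in response rate.

    baseline_rate: responses per minute before the consequent event was arranged.
    later_rate:    responses per minute after the event has followed responses.
    tolerance:     hypothetical margin below which a change is ignored.
    """
    change = later_rate - baseline_rate
    if change > tolerance:
        return Process.REINFORCEMENT
    if change < -tolerance:
        return Process.PUNISHMENT
    return Process.NO_EFFECT


# The same stimulus event under two hypothetical conditions (illustrative rates only).
condition_a = classify_process(baseline_rate=5.0, later_rate=40.0)
condition_b = classify_process(baseline_rate=60.0, later_rate=10.0)
print(condition_a.value, condition_b.value)
```

On this reading the same event is classified differently under the two conditions without any appeal to a paradoxical category, because the classification never refers to the event's supposed inherent properties.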

The work of Premack (1959, 1965, 1971) has empha-
sized that reinforcers and punishers are not discrete
fixed classes of events. He has made the intriguing
proposal that reinforcement and punishment are 
based on the probabilities of responses associated with 
different events. He suggests that when an event 
associated with a high response probability follows an 
event associated with a low response probability, rein- 
forcement will occur; however, when an event asso- 
ciated with a low response probability follows an 
event associated with a high response probability, 
punishment will occur. As noted previously, the 
process of reinforcement can be demonstrated in the 
situation in which the lever-pressing responses of a 
food-deprived rat result in the delivery of food pellets. 
Under the conditions of this demonstration, the 
initial probability that the rat will press the lever is 
low, whereas the initial probability that it will eat the 
food pellets is high; thus reinforcement is easily 
demonstrated. 

A notable aspect of Premack’s formulation is its 
recognition of the relativity of events as reinforcers or 
punishers. This relativity has been demonstrated 
under conditions in which the access of rats to a drink- 
ing tube or an activity wheel could be controlled 
experimentally (Premack, 1971). Licking on the drink-
ing tube was sensed automatically by means of an
electronic circuit and recorded on a counter (drinkom- 
eter). The activity wheel revolved at a preset rate for 
5 sec whenever the rat pressed a retractable lever. The 
initial relative probabilities of drinking and running 
responses were assessed in daily 15-min control sessions. 
The rats spent more time drinking than running when
they were water-deprived, but spent more time running 
than drinking when they were not water-deprived. In 
subsequent experimental sessions, the activity wheel 
was operated only after the rat made a specified num- 
ber of licks on the drinking tube; that is, drinking re- 
sulted in brief periods of forced running. Drinking 
responses were increased above control levels (rein- 
forcement) by operation of the activity wheel in rats 
which were not water-deprived but were decreased be- 
low control levels (punishment) by operation of the 
activity wheel in rats which were water-deprived. 
Moreover, the degree of suppression was inversely re- 
lated to the probability of operating the activity wheel 
in the control sessions. Thus operation of the activity 
wheel could be either a reinforcer or a punisher de- 
pending on whether the initial relative probability of 
running was high or low. 
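
The prediction described in the two preceding paragraphs can be written as a short sketch, assuming that the initial response probabilities are expressed as proportions of control-session time. The function name `premack_prediction`, the treatment of equal probabilities, and all numerical values are hypothetical illustrations, not data from Premack (1971).

```python
def premack_prediction(p_instrumental: float, p_contingent: float) -> str:
    """Predict the effect on the instrumental response when the contingent
    event follows it, from the initial probabilities of the two responses.

    p_instrumental: initial probability of the response that produces the event.
    p_contingent:   initial probability of the response associated with the event.
    """
    if p_contingent > p_instrumental:
        return "reinforcement"
    if p_contingent < p_instrumental:
        return "punishment"
    # Equal probabilities are not discussed in the text; this branch is an assumption.
    return "no change predicted"


# Hypothetical proportions of control-session time (illustrative values only).
water_deprived = {"drinking": 0.40, "running": 0.10}
not_water_deprived = {"drinking": 0.05, "running": 0.25}

for label, base in (("water-deprived", water_deprived),
                    ("not water-deprived", not_water_deprived)):
    # Drinking is the instrumental response; forced running is the contingent event.
    effect = premack_prediction(base["drinking"], base["running"])
    print(f"{label}: running contingent on drinking -> {effect}")
```

With these illustrative values the sketch gives punishment for the water-deprived case and reinforcement for the non-deprived case, which is the direction of the results described above. It says nothing about the graded relation between suppression and control-session probability.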

The results of some experiments with two-compo- 
nent chained schedules seem inconsistent with the no- 
tion that the reinforcing effectiveness of an event is 
directly related to the response probability associated 
with it; that is, responding in the first component can 
be well maintained by the presentation of the second
component despite a low rate of responding in the 
second component (see Gollub, Chapter 10 of this 
volume). For example, relations between rates of re- 
sponding controlled by various intensities of intra- 
cranial stimulation and the effectiveness of such stim- 
ulation in controlling behavior have been studied by 
Hawkins and Pliskoff (1964). Each response on one of 
two response keys resulted in electrical stimulation of 
an area of the hypothalamus; this response key (key B) 
was retracted from the apparatus after every fifth 
response. Responses on the other response key (key A)
were maintained under a VI schedule by the reintro-
duction of key B. As the intensity of the intracranial
stimulation was increased over a range of parameter 
values, the rate of responding on key B (computed 
from the latency of the first of five responses) in- 
creased and then decreased (or simply decreased). 
Over the same range of intensities, however, the rates 
of responding on key A increased. Thus under this 
two-component chained schedule, the effectiveness of 
the second component in maintaining responding in 
the first component was directly related to the inten- 
sity of intracranial stimulation but was not related in 
any simple way to rate of responding in the second 
component. This type of experiment, like those de-
scribed in the preceding section, indicates the impor-
tance of the schedule of presentation of any event in
determining how it affects behavior.

Premack’s promising theoretical account of the con-
ditions under which an event will function as a rein-
forcer or as a punisher is being refined and extended,
but it is still difficult to apply in some situations. One
problem is how to assign an initial probability of re-
sponse to certain types of events (for example, events
such as intracranial stimulation or intravenous drug
injection that are delivered directly to the animal).
Premack notes that it should be possible to develop 
indirect ways of assessing the initial response probabil- 
ities of such events. It seems likely, however, that this 
indirect approach would entail the same difficulties as 
the discriminative stimulus hypothesis of condi- 
tioned reinforcement (see Gollub, Chapter 10 of this 
volume). Although the point of view in the present 
chapter is similar to that of Premack in stressing the 
relativity of events as reinforcers and punishers, our 
emphasis is on the ongoing rate of responding at the 
time an event occurs and on the way in which the 
event is scheduled. The role of schedules will be con- 
sidered in more detail at the end of this chapter. 

It is clear that the effect of a given consequent 
event on rate of responding is likely to be different 
when it follows responding occurring at different fre-
quencies. Depending on the frequency of ongoing re-
sponding, behavior may be modulated more than
changed in absolute level. For example, Skinner and 
Morse (1958) studied rats running in an activity 
wheel under conditions in which running resulted in 
the presentation of a food pellet under a 5-min FI 
schedule. The rats characteristically paused for a rela- 
tively long period of time after each food presentation 
and then ran until food was presented again. Whether 
or not the overall rate of running was increased or 
decreased from the level of running that prevailed 
when the schedule was not in effect, the pattern of 
running became orderly with respect to the schedule 
of food presentation. 

Reinforcement depends upon the quantitative 
properties of behavior, so that different responses are 
modified differently. It is usually easier to increase the 
frequency of an operant that is occurring infrequently 
than that of an operant that is occurring frequently.
Although interactions between the levels of ongoing 
behavior and consequent events have tended to be 
ignored in studies on reinforcement, these considera- 
tions have become increasingly important with the 
development of techniques for engendering strong, 
reproducible patterns of behavior. 

Much information is available on the importance 
of ongoing maintenance conditions in determining 
the effects of consequent noxious stimuli. As noted 
previously, the defining operation of a punisher is a
subsequent decrease in the frequency of responses 
similar to one that immediately preceded the 
punisher. There is no fundamental logical difference 
between the punishment and reinforcement situa- 
tions; in both there is an assumption that the level of 
behavior before the presentation of the event is
measurable and sufficiently reproducible to permit 
identification of changes in its rate. In dealing with 
reinforcement, the level is usually low or is developed
through shaping, and experimenters may easily con- 
duct experiments without forcing themselves to pon- 
der the determinants of behavior in the absence of 
reinforcement (cf. Segal, 1972). In dealing with 
punishment, the practical situation is entirely differ-
ent. Measurable levels of some behavior are required
both before and after the introduction of the punish- 
ing event; in practice this is usually accomplished by 
using some schedule of reinforcement to engender a 
sustained rate of responding. Here the experimenter 
is forced to use some explicit condition, and therefore 
evidence has accumulated on the effects of using 
noxious stimuli as consequent events under different 
maintenance conditions (Azrin & Holz, 1966; Fantino, 
1973). For example, when a brief electric shock fol-
lows each response under FR (fixed-ratio) and FI
schedules of food presentation in the pigeon, a pattern 
of suppression develops that is different under the two 
schedules (Azrin, 1959; Holz & Azrin, 1962). Under 
a single type of schedule, the effects of response-con- 
tingent electric shocks may be different when they are 
introduced in different temporal relations to the con- 
sequent event (Holz & Azrin, 1962). Besides the type 
of maintenance schedule, other parameters are also 
important in determining the effects of noxious 
stimuli as consequent events. When behavior is main- 
tained under VI or FR schedules of food presentation, 
the suppressive effect of response-produced electric 
shocks is critically dependent upon the degree of food 
deprivation (Azrin, 1960; Azrin, Holz, & Hake, 1963). 
For example, the suppression produced by an intense 
shock delivered every 100 responses became progres- 
sively greater as the maintenance body weight of the 
subject was increased from 60 to 85% (Azrin et al., 
1963). This finding shows clearly that the effect of a 
response-produced electric shock depends on the pre- 
vailing conditions. In this case, whether the same in- 
tense electric shock suppressed behavior or not de- 
pended on the degree of food deprivation. 

Because the suppressive effects of response-pro- 
duced electric shocks do depend upon the exact main- 
tenance conditions, experiments on the effects of in- 
troducing response-produced electric shocks have 
often yielded different results. At one time such differ- 
ences were interpreted as indicating that punishment 
was a less reliable behavioral process than was rein- 
forcement. Such differences are due entirely to a lack 
of comparability in other features of the situation be- 
ing studied; when the maintenance conditions of 
experiments are comparable, the effects of response- 
produced electric shocks are comparable and repro- 
ducible from experiment to experiment. 

The process of reinforcement is as dependent upon
variations in environmental conditions as is the
process of punishment. The historical difference be-
tween reinforcement and punishment is that a greater
range of maintenance conditions has been studied
with punishment, which explains why punishment
may have appeared to be variable. In actual fact,
studies on reinforcement have dealt mostly with re-
strictive, idealized cases, which may have given the
false impression that the effects of known reinforcers
are not critically dependent upon conditions under
which they operate. The important point is not that
punishment is a variable process, but that both
punishment and reinforcement depend upon the
quantitative conditions of the environment. When
one considers the potential range of environmental
conditions under which behavior can be studied, per-
haps more is known about punishment than about
reinforcement.


CHARACTERISTICS OF RESPONSES 


Although the range of behaviors that can be con- 
trolled by operant conditioning is vast, the type of 
response selected for measurement can be critical. 
Responses of the classes most commonly used have 
the following characteristics: they are easily identifi- 
able so that repeated instances can be reliably 
counted; they are easily recorded with automatic 
equipment; they have short durations; and they are 
readily repeatable. Some operants are easily estab- 
lished and “well behaved.” In contrast, certain types 
of species-specific responses, especially elicited re- 
sponses, are difficult to control directly by reinforce- 
ment or punishment. 

Responses elicited by electric shock in the squirrel 
monkey are of interest because their temporal pattern- 
ing can be modulated by consequent events. These 
stereotyped patterns of behavior include attacks on 
other members of the same species or on certain other 
nearby objects (Azrin, Hutchinson, & Hake, 1967;
Hake & Campbell, 1972; Hutchinson, Azrin, & Hake,
1966; Hutchinson, Azrin, & Renfrew, 1968; Hutchin-
son, Renfrew, & Young, 1971). If the monkey is par-
tially restrained in a chair, for example, electric shocks
to the monkey's tail will cause it to pull and bite a
leash attached to its collar. In one study, electric shock
was used both to elicit and to modulate leash-pulling
responses (Morse, Mead, & Kelleher, 1967). The leash
was fastened to a lever so that biting and pulling on
the leash repeatedly closed a switch attached to a
lever. Two of the three monkeys were studied initially
under an FT (fixed-time) schedule in which an electric
shock was delivered automatically every 60 sec. Each
electric shock elicited pulling and biting the leash,
which caused a burst of switch closures temporally
related to the biting and pulling. The burst of switch 
closures just after shock usually ceased abruptly after 
a few seconds; however, a few more switch closures 
often occurred just before the next electric shock was 
delivered. As the session proceeded, the number of 
switch closures just after shock tended to decrease, 
while the number occurring just before shock tended 
to increase (Figure 2A). 

Subsequently, the schedule was changed so that the 
first closure of the switch 30 sec after an electric shock 
produced the next shock; if no switch closure occurred 
between 30 and 60 sec, the shock was delivered auto-
matically at 60 sec after the previous shock. Under
this FI 30-sec schedule, the switch closure was con-
sidered a response, defined by its relation to the shock.
Initially, this response occurred predominantly after
an electric shock; however, most shocks were produced
by a response occurring between 30 and 60 sec after
the preceding shock (left of Figure 2B). With further
exposure to the FI 30-sec schedule (right of Figure
2B), responding declined soon after an electric shock
was delivered and then increased until the first re-
sponse after 30 sec produced the next shock. Only the
first electric shock in most sessions was delivered auto-
matically. When electric shocks were not delivered
(Figure 2C), this monkey seldom responded.

Fig. 2. Different patterns of responding (switch closures related
to pulling and biting a leash) in a squirrel monkey under fixed-
time (FT) (A) and FI (B) schedules of electric shock presenta-
tion, and rapid cessation of responding when shocks were not
presented (C). Ordinate: cumulative number of responses;
abscissa: time. Electric shock presentations (7 mA) are indicated
by short diagonal strokes on the cumulative record; strokes on
the event record indicate shocks delivered under the FT 60-sec
schedule. The recording pen reset to the base line whenever 250
responses accumulated and at end of session. A: Session 18, FT
60-sec schedule; B: Sessions 5, 9, and [illegible in source], shocks
scheduled under a FI 30-sec schedule; C: Session 104, no shocks
scheduled. In part A responding occurred predominantly after
shocks, producing a pattern of deceleration. With continued
exposure to the FI schedule (B), responding occurred
predominantly before the shock, producing a pattern of
acceleration. When shocks were omitted (C), few responses
occurred. (From Morse & Kelleher, 1970.)
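
The scheduling rule described above (the first switch closure at least 30 sec after a shock produces the next shock; otherwise a shock is delivered automatically at 60 sec) can be sketched as follows. This is an illustration of the contingency only, not the authors' control program. The function name `run_fi_session`, the choice to time the first interval from the start of the session, the treatment of closures at exactly 30 or 60 sec, and the example response times are all assumptions.

```python
def run_fi_session(response_times, session_end, interval=30.0, limit=60.0):
    """Return a list of (shock_time, produced_by_response) tuples.

    response_times: times (sec from session start) of switch closures.
    session_end:    session duration in sec; no shock is scheduled after it.
    interval:       FI value; earlier closures have no scheduled consequence.
    limit:          time after the previous shock at which a shock is
                    delivered automatically if no eligible closure occurred.
    """
    responses = sorted(response_times)
    shocks = []
    last_shock = 0.0  # assumption: the first interval is timed from session start
    i = 0
    while True:
        # Closures before the interval has elapsed do not produce a shock.
        while i < len(responses) and responses[i] < last_shock + interval:
            i += 1
        if i < len(responses) and responses[i] <= last_shock + limit:
            shock_time, by_response = responses[i], True
            i += 1
        else:
            shock_time, by_response = last_shock + limit, False
        if shock_time > session_end:
            break
        shocks.append((shock_time, by_response))
        last_shock = shock_time
    return shocks


# Hypothetical switch-closure times in seconds (not data from the study).
example = run_fi_session([5, 40, 45, 75, 200], session_end=300)
for time_sec, by_response in example:
    source = "response-produced" if by_response else "automatic"
    print(f"shock at {time_sec} sec ({source})")
```

In the sketch the timing rule is a fixed contingency: only the subject's responding determines whether a given shock is response-produced or automatic.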

The rapid loss of responding in the absence of elec-
tric shocks distinguishes this FI pattern of responding
from other FI patterns of key pressing engendered in
the squirrel monkey by the presentation of electric
shock (Kelleher & Morse, 1968a) or by the termination 
of a stimulus-shock complex (Morse & Kelleher, 1966). 
Although the performances were developed and main- 
tained by the shock, two of the monkeys usually began 
responding only after a shock occurred. In another 
monkey, responding was maintained under only the 
FI 30-sec schedule (no shocks delivered automatically)
and then under a FI 5-min schedule of electric shock
presentation (Figure 3). A positively accelerated pat-
tern of responding characteristic of FI schedules subse-
quently developed, most responding occurring before the re-
sponse-produced electric shock. The leash pulling and
biting controlled by the electric shock appear to have 
characteristics of both elicited and operant behavior 
(see also the section on response-produced electric 
shocks near the end of the chapter). 


Fig. 3. Development of positively accelerated responding
(switch closures related to pulling and biting leash) in a squirrel
monkey under a 5-min FI schedule of electric shock presentation
(7 mA). Recording as in Figure 2. Top: Session 117, initial
performance under 5-min FI schedule after 30-sec FI schedule;
bottom: Session 153. (From Morse & Kelleher, 1970.)


ADVENTITIOUS REINFORCEMENT AND
PUNISHMENT: IMPORTANCE OF HISTORY


Adventitious relations between behavior and the
occurrence of some event are especially useful in
understanding reinforcement and punishment. In this
case, the event is presented in time independently of
any particular behavior. For example, if a food pellet
is delivered to a rat according to an FT schedule 
under suitable conditions (appropriate type of food, 
degree of deprivation, and temporal parameter of the 
schedule), some identifiable sequence of behavior will 
develop. There is no correct response or problem
solution in this situation, but the rat’s behavior is 
changed. The delivery of the pellets inevitably follows 
some operant feature of the rat’s behavior. This 
feature then becomes more prominent and more likely 
to be followed by a subsequent pellet delivery and 
thus becomes a maintained response. A positively 
accelerated pattern of responding can be developed 
and maintained under such an FT schedule of food 
presentation (Skinner, 1948). 
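
The feedback loop described in this paragraph can be illustrated with a toy simulation. It is a deliberately crude sketch, not a model proposed in the source: the behavior labels, the weight-increment rule, and every parameter value are assumptions, and the sketch does not attempt to reproduce the positively accelerated temporal pattern mentioned above. It shows only how response-independent deliveries on an FT schedule can make whichever behavior happens to precede them more likely to precede later deliveries.

```python
import random


def simulate_fixed_time(behaviors, n_ticks, period, increment=1.0, seed=0):
    """Toy model of adventitious reinforcement under an FT schedule.

    At each tick one behavior is sampled in proportion to its current weight.
    Every `period` ticks a pellet is delivered regardless of behavior, and the
    behavior occurring at that tick has its weight raised by `increment`.
    Returns the final weights and the number of deliveries.
    """
    rng = random.Random(seed)  # seeded so that a run is reproducible
    weights = {name: 1.0 for name in behaviors}
    deliveries = 0
    for tick in range(1, n_ticks + 1):
        names = list(weights)
        current = rng.choices(names, weights=[weights[n] for n in names])[0]
        if tick % period == 0:
            # Delivery depends on time alone, never on `current`.
            weights[current] += increment
            deliveries += 1
    return weights, deliveries


# Hypothetical behavior labels and parameters.
final_weights, n_deliveries = simulate_fixed_time(
    ["turning", "rearing", "sniffing"], n_ticks=600, period=30)
print(n_deliveries, final_weights)
```

Because each delivery strengthens the behavior it happens to follow, early coincidences tend to compound, which is the sense in which a feature of behavior "becomes more prominent" without any response being required.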

Adventitious punishment differs from adventitious 
reinforcement only in the direction of the effect (cf. 
Fantino, 1975), but in this case it is necessary to study 
some prominent feature of behavior that can be de- 
creased in frequency. For example, Azrin (1956) main- 
tained key pecking in pigeons under a VI schedule of 
food presentation and then presented an intense elec- 
tric shock according to an FT schedule. A negatively 
accelerated pattern of responding was engendered and 
maintained under this FT schedule. 

In determining the effects of events presented inde- 
pendently of responding, the quantitative properties 
of ongoing behavior are especially important. These 
properties, in turn, depend on such factors as the 
history of the individual and the tendency of the 
event to elicit responding. It has become accepted that 
an event is more likely to change ensuing behavior 
when it coincides with certain features of behavior 
than when it coincides with other features. Thus 
different behaviors vary in their frequency of occur- 
rence and in their susceptibility to modification by 
response-independent environmental events (Staddon
& Simmelhag, 1971; also see Schwartz & Gamzu, Chap- 
ter 3 of this volume). In pigeons with a long history 
of responding under schedules of response-dependent 
food presentation, the presentation of food under FT 
schedules can maintain responding indefinitely. Al- 
though the rates of responding under such FT sched- 
ules are characteristically lower than those under com- 
parable FI schedules, similar patterns of responding 
can be maintained (Zeiler, 1968). In pigeons which 
have had only three response-dependent food presen- 
tations, however, FT schedules maintain low rates and 
erratic patterns of responding (Neuringer, 1970). It is 
frequently asserted that FI responding maintained by 
response-produced electric shocks is basically different 
from other instances of FI responding; however, 
phenomena involved in studies of electric shock are 
analogous to those involved in studies of food presen- 
tation. As noted in the preceding section, electric 
shocks do elicit various responses, but similar phenom- 
ena are reported in studies of autoshaping with food. 
Moreover, studies comparing FT and FI schedules of 
electric shock presentation have produced results 
similar to those obtained with schedules of food 
presentation; that is, under the FT schedules, rates of
responding were relatively lower and responding was 
not always positively accelerated (Kelleher, Riddle, & 
Cook, 1963; McKearney, 1974; Morse & Kelleher, 
1970).

Experimental history can be critical in determining 
whether an event occurring independently of respond- 
ing will result in adventitious punishment or adven- 
titious reinforcement. For example, responding main- 
tained under a schedule of food presentation is 
suppressed under many conditions in which response- 
independent electric shocks are delivered intermit- 
tently (Azrin, 1956; Estes & Skinner, 1941). When re- 
sponding in the rhesus monkey was maintained under 
a schedule of electric shock postponement, however, it 
was found that superimposing an FT schedule of
electric shock delivery markedly increased responding 
(Sidman, Herrnstein, & Conrad, 1957). Moreover, 
when electric shocks were no longer scheduled under 
the avoidance procedure, responding not only per- 
sisted under the FT schedule but became positively 
accelerated as “the animal lever-pressed right ‘into’ the 
shock” (Sidman, Herrnstein, & Conrad, 1957, p. 53). 
Many subsequent studies have shown that the delivery 
of electric shocks independently of responding can en- 
hance responding in animals that have responded 
under schedules of electric shock postponement (for 
example, Kelleher, Riddle, & Cook, 1963; Waller & 
Waller, 1963). After avoidance responding had been
well established and then extinguished in squirrel
monkeys, for example, responding recovered when
electric shocks were presented independently of re-
sponses, but ceased again when no shocks were de-
livered. When electric shocks were delivered under a
10-min FT schedule, substantial levels of responding
were maintained, as shown in Figure 4. The patterns
of responding were comparable to those that have
been described under FT schedules of food presen-
tation.

Fig. 4. Patterns of responding under a 10-min FT schedule of
electric shock presentation (3 mA) in squirrel monkeys with a
history of responding under schedules of electric shock post-
ponement. Ordinate: cumulative number of responses; abscissa:
time. Short diagonal strokes on the cumulative record indicate
shock presentations. The records of monkeys K5 and K31 have
been broken into 30-min segments and displaced along the
abscissa; those of monkey K28 have been broken into 10-min
segments. Note that the pattern of responding in many of the
individual segments was S-shaped because responding decreased
near the end. (From Kelleher, Riddle, & Cook, 1963. Copyright
1963 by the Society for the Experimental Analysis of Behavior, Inc.)

Circumstances in which adventitious punishment 
could be changed to adventitious reinforcement were 
first described by Herrnstein and Sidman (1958). 
Initially, responding of rhesus monkeys under a sched- 
ule of food presentation was suppressed in the pres- 
ence of a clicking sound by intermittent electric 
shocks delivered under an FT schedule. Then the 
monkeys were trained to respond under a schedule in 
which responses postponed electric shocks. Finally, the 
schedule of food presentation was reinstated, but 
when the clicking sound associated with the FT 
schedule was presented, responding was enhanced 
rather than suppressed. Whether response-indepen- 
dent electric shocks suppressed or enhanced respond- 
ing depended on the experimental history of the 
monkey. 


CRITERIA FOR COMPARING
CONSEQUENT EVENTS 


The fundamental importance of orderly changes in 
rate of responding to the study of behavior was dis- 
cussed in the first section of this chapter. Operants 
were defined as functionally identifiable, reproducible 
classes of responses. The behavioral changes associated 
with increases in responding were considered rein-
forcement, and those associated with decreases in re-
sponding were considered punishment. It was noted
that various schedule conditions gave rise to charac-
teristic reproducible patterns of responding in time,
yet little consideration was given to the criteria for
identifying these different reproducible behavioral
processes. These criteria are not absolute. They de-
pend very much on the “state of the art” and the
consensus of contemporaries.

In comparing performances in the earliest experi-
ments on schedule-controlled behavior maintained by
food presentation with performances in more recent
experiments, it is clear that progress has been made in
achieving reproducibility and control of behavior. In
general, this improvement in control of behavior is
not because of any change in the properties of the
food used to maintain behavior, although it might be
said that there had been a change in the effect of the
food. The change has come about because optimal
parameters of various features in the situation have
been combined, including the parameters of the con-
sequent event, the location and nature of external
stimuli, the types of keys used, the reliability of the
controlling equipment, the conditions of deprivation,
the training conditions, and the experience of the sub-
ject. Because various combinations of conditions will
suffice and because no single feature is likely to be es-
sential, it may not be always possible to explain the
reasons for technical advances. In some instances, spe-
cific changes in current practices have been shown to
be important. For example, if food-deprived rats are 
maintained at 60-65% of ad lib body weight rather 
than at 80%, characteristic schedule-controlled perfor- 
mances are more easily obtained. In the development 
of stimulus control, the location, intensity, and dura-
tion of the controlling stimuli and the schedule un-
der which they are presented can result in “errorless”
discriminations, whereas only slightly different condi-
tions can result in a much slower development of con-
trol. Perhaps the most important ingredient of ad-
vances in experimental control is the explicit attempt
by investigators to achieve greater control. After the
initial work showing the possibility of errorless dis-
criminations (Terrace, 1963), many other investigators
soon found it possible to develop discriminative per-
formances very quickly. Given an explicit description
of what behavior is to be achieved, the conditions
sufficient to realize the result can usually be found.


Fig. 5. Generality of characteristic FI performance (no respond-
ing, then acceleration to a maintained steady rate of responding).
Ordinate: cumulative number of responses; abscissa: time. An
FI schedule of presentation of food or water was in operation
in all examples shown in this figure. Upper frame: individual
pigeon (P-4) pecking plastic key (food). Three different dura-
tions of the fixed interval are shown; the general pattern
persists despite the hundredfold change in the schedule param-
eter. Food presentations, ending each fixed interval, are
marked by short diagonal strokes on the cumulative record.
Lower left frame: performances under a 10-min FI schedule.
Food or water presentations, ending each interval, are marked
by the resetting of the recording pen to the base line. Lower
right frame: performances under a 5-min FI schedule. The
species, the type of switch recording the response, and the
reinforcer presented are indicated above the records. The pigeon
pecked a plastic key with its beak; the rat and chimpanzee
pressed a horizontal lever with their paws; the cat depressed
a rounded knob with its paw. The rat turned the wheel by
running; only a turn of 180° is reinforced, but the cumulative
distance the wheel turns is recorded directly. (From Kelleher
& Morse, 1968b.)

Schedule-controlled patterns of responding give a 
meaningful way of comparing different species, differ- 
ent maintenance events, or other interventions. Sched- 
ule-controlled patterns appear to have great general- 
ity; they occur in diverse species with a variety of 
different maintenance events (Figure 5). Different 
schedule performances depend upon the particular 
maintenance conditions. Usually, subjects with similar 
past experience exposed to the same parameter values 
can be expected to respond comparably, although the 
actual rates of responding may differ somewhat (Wal-
ler & Morse, 1963). To produce the same response rate,
the parameters of the schedule may have to be differ-
ent for different individuals.


Fig. 6. Patterns of responding of three species (pigeon, rat, and
monkey) under mult (multiple) FI FR schedules of reinforce-
ment. (From Skinner, 1956. © 1956 by the American Psycho-
logical Association. Reprinted by permission.)


The value of producing such comparable rates and
patterns of schedule per-
formances is that these reproducible temporal patterns
represent an invariant behavioral process. Stevens 
(1951, pp. 20-21), discussing the importance of in- 
variance as a tool of thought, concludes by saying: 
“The scientist is usually looking for invariance 
whether he knows it or not. . . . The delineation of 
the conditions of invariance for any phenomenon 
would tell us all we want to know about the matter.” 
This same point of view is expressed by Skinner (1956, 
pp. 230-231) in commenting on Figure 6, which shows 
performances of a pigeon, rat, and monkey under a 
mult (multiple) FR FI schedule: 


Pigeon, rat, monkey, which is which? It doesn’t 
matter. Of course, these three species have be- 
havioral repertoires which are as different as 
their anatomies. But once you have allowed for 
differences in the ways in which they make con- 
tact with the environment, and in the ways in 
which they act upon the environment, what 
remains of their behavior shows astonishingly 
similar properties. Mice, cats, dogs and human 
children could have added other curves to this 
figure. And when organisms which differ as 
widely as this nevertheless show similar prop- 
erties of behavior, differences between members 
of the same species may be viewed more hope- 
fully. Difficult problems of idiosyncrasy or indi- 
viduality will always arise as products of bio- 
logical and cultural processes, but it is the very 
business of the experimental analysis of behavior 
to devise techniques which reduce their effects 
except when they are explicitly under investiga-
tion.


Fig. 7. Characteristic FI performance in the squirrel monkey
under a multiple schedule of stimulus-shock termination and
food presentation (Monkey S-50). The arrow indicates the
change from the schedule of stimulus-shock termination to the
schedule of food presentation. Left of the arrow: in the presence
of a white light, electric shocks were scheduled to occur at 3-sec
intervals starting after 5 min; the first response after 5 min
terminated the stimulus-shock complex for 1 min. No shocks
were delivered in the record segment shown. Right of the
arrow: in the presence of a red light, the first response after 5
min was followed by food presentation and terminated the light
for 1 min. Food presentations are indicated by short diagonal
strokes on the cumulative record. The recording pen reset to
the base line at the end of each fixed interval. The recorder did
not run during the minute of darkness following each fixed
interval. (From Kelleher & Morse, 1968b.)


One might note that the type of maintenance event, 
not specified in this example, is not critical. Figure 7
shows, for example, FI performances in a squirrel 
monkey maintained by the presentation of food and 
by the termination of a stimulus complex comprising 
a visual stimulus and an associated shock schedule 
(Kelleher & Morse, 1968b; Morse & Kelleher, 1966). 

Schedule performances are invariant in part be-
cause techniques have been devised that produce in-
variance. By “delineating” the conditions of invari-
ance for different species or for different maintenance
events, there is a meaningful behavioral basis for com- 
paring the effects of other independent variables. Of 
course, there are various bases, both formal and 
empirical, for making comparisons among different 
conditions, but a compelling argument can be made 
that comparisons among different events should be 
made on the basis of their similarities rather than 
their differences. The many different events that have 
been used to maintain or suppress behavior function 
similarly under appropriate conditions, but the con- 
ditions required for their suitability as reinforcers or 
punishers are different. Thus the essential aspect in 
studying events as reinforcers and punishers is not any 
inherent property of the events but rather the speci- 
fication of the conditions under which events modify 
behavior. As we have noted, the experience of the 
individual and the schedule under which events are 
scheduled have often been neglected in favor of the 
more static properties of events. The value of dealing
with reproducible behavioral processes has already
been described. The next sections will consider some 
actual instances involving comparisons between differ- 
ent maintenance events that developed from practical 
applications in the field of behavioral pharmacology. 


COMPARISONS OF THE EFFECTS OF 
DRUGS ON PERFORMANCES MAINTAINED 
BY DIFFERENT CONSEQUENCES 


Although there was once considerable work com- 
paring strengths of different drive states, in recent 
times interest in this topic has diminished. This 
change has resulted partly from the repeated find-
ing that different schedule-controlled patterns of re-
sponding can be engendered in individual subjects
with multiple schedules. Schedule performances em-
body a great deal of what traditionally has been called
motivation (see the section on response-produced
electric shocks near the end of the chapter). For other
reasons, however, behavioral pharmacologists have
long been interested in determining whether drugs
have selective and specific effects on behavior con-
trolled by noxious stimuli as compared with other
events. Many investigators have compared the effects
of drugs on behavior maintained by presentation of
food with their effects on behavior maintained by
termination (or postponement) of electric shock.
Much of the interest in such comparisons derives
from motivational interpretations of the clinical uses
of drugs. After the development of the major and the
minor tranquilizers, these drugs were soon used widely
in the clinical treatment of psychiatric disorders in-
volving agitation, apprehension, tension, or anxiety
states. Consequently, it was generally accepted that the
effects of these drugs on behavior would be under-
stood in terms of their direct effects on underlying
motivational states or drives.

Such motivational interpretations of the clinical 
uses of drugs promoted interest in experimental study 
of how drugs affect behavior controlled by noxious 
stimuli. It has been assumed that noxious stimuli 
control behavior by engendering an emotional state of 
fear or anxiety; changes in behavior after a drug have 
been explained as changes in this emotional state. 
Motivational interpretations have also been applied 
to the effects of drugs on behavior maintained by food 
presentation or water presentation; changes in be- 
havior after drugs have been explained as changes in 
hunger or thirst. 

Because hypothetical drive states, such as hunger or 
anxiety, are assumed to depend upon controlling en-
vironmental events, such as deprivation of food or
presentation of electric shock, the generality of pre- 
dictions about drugs affecting underlying motivational 
or emotional states can be evaluated in objective ex- 
periments. If drugs directly affect motivational states, 
the kind of effect a drug has on different behaviors 
should then depend on similarities or differences in 
the events controlling the behavior. Relevant experi- 
mental studies refute this view. 

First, different patterns of responding maintained 
by the same event are selectively affected by drugs 
even when these patterns repeatedly alternate under 
a multiple schedule during the same session (see 
Kelleher & Morse, 1968b). In such instances, the differ- 
ential effects of drugs cannot be attributed to the 
consequent event. The direction of the dependency of
drug effects on schedule performance can differ among 
drugs. Barbiturates decrease responding under many 
parameter values of FI schedules at doses that do not 
decrease responding under FR schedules (Dews, 1955; 
Morse, 1962). Other drugs have the opposite effect: 
responding under FR schedules can be decreased by 
doses of amphetamines that increase responding under 
FI schedules (Kelleher & Morse, 1964; Smith, 1964). 

Because the schedule can profoundly modify the 
effects of drugs, comparable schedules and comparable 
schedule-controlled patterns of responding must be 
established with different events (for example, food 
and electric shock) before there can be meaningful 
comparisons of the effects of drugs on responding con- 
trolled by these events. When schedule conditions and 
performances are comparable, many drugs have simi- 
lar effects on behaviors controlled by different events. 
For example, in the rat responding under an FR 1
schedule of reinforcement, chlorpromazine decreases 
responding maintained by the presentation of food, 
intracranial stimulation, or heat. Appropriate doses of 
amphetamine increase responding maintained by the 
presentation of food, intracranial stimulation, or 
heat, while higher doses decrease responding (for de- 
tails see Kelleher & Morse, 1968b). 

Experiments by Weiss and Laties (1963) using heat 
as a reinforcer are particularly significant for the 
present discussion because they studied the effects of 
several drugs on skin and body temperature, as well 
as on frequency of responding maintained by heat 
presentation. The experiments were conducted with 
individual shaved rats in a small chamber in a re- 
frigerated room; whenever the rat pressed a lever 
within the chamber, a lamp above the chamber de- 
livered 2 sec of infrared heat. At certain temperatures, 
chlorpromazine decreased rates of responding even 
though it enhanced the rate at which temperature fell
in the cold, and amphetamine increased rates of re-
sponding even though it caused the skin temperature
to rise significantly. Noting that these effects are simi- 
lar to those obtained when food or water is used to 
maintain behavior, Weiss and Laties (1963, p. 7) con- 
cluded that “the behavioral properties of these drugs 
are largely independent of the reinforcer that main- 
tains the behavior, or, put another way, of the motiva- 
tional state that supports it.” 

Some investigators have reported that chlorproma-
zine and reserpine have more marked effects on be-
havior maintained by electric shock than on behavior
maintained by the presentation of food. Other in-
vestigators have reported that chlorpromazine has
more marked effects on behavior maintained by pre- 
sentation of food or intracranial stimulation than on 
behavior maintained by avoidance of electric shock. 
Still other studies comparing behaviors maintained by 
food and by electric shock found no difference in 
sensitivity to reserpine (for details see Kelleher & 
Morse, 1964, 1968b, 1968c). These results reflect on 
the difficulties involved in comparing behavioral 
effects of drugs on performances maintained with 
different reinforcers. In most of the studies the types 
and parameters of the schedules differed as well as 
the consequent events. When different reinforcers are 
presented according to different schedules, the effects 
of a drug may be largely determined by the schedule-
controlled patterns of responding. For comparing the
effects of drugs on behaviors maintained by different
reinforcers, it is useful to start with similar schedules
of reinforcement; however, there is still no a priori 
basis for equating such parameters as amounts of 
food and intensity of electric shock. It is unreasonable 
to presume that certain parameter values of one arbi- 
trarily chosen schedule of food presentation will be 
comparable to the same parameters of an arbitrarily 
chosen schedule of electric shock termination. 

The most satisfactory way to attack these problems 
is to obtain as nearly as possible identical patterns of 
responding maintained by different events and then to 
establish dose-effect relations for drugs on these pat-
terns. Functional relations between drugs and be-
havior maintained by different schedules with each
event can then be compared. Earlier it was noted that 
the conditions sufficient to realize a desired behavioral 
performance can usually be found. It is noteworthy 
that two different procedures for establishing com- 
parable patterns of responding with formally com- 
parable schedules of food presentation and electric 
shock termination have been developed. 
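
One way to put performances maintained by different events on a common footing is to express each drug-session rate as a percentage of that subject's mean control rate, and then to average those percentages across subjects at each dose. The following is a minimal sketch of that calculation in Python. The function names, subject labels, doses, and all of the rates are hypothetical illustrations, not data from any study cited here.

```python
from statistics import mean


def percent_of_control(drug_rate, control_rates):
    """Express one drug-session rate as a percentage of the mean control rate."""
    return 100.0 * drug_rate / mean(control_rates)


def dose_effect_curve(control_rates, drug_rates_by_dose):
    """Mean percent-of-control at each dose, averaged across subjects.

    control_rates:      {subject: [control-session rates]}
    drug_rates_by_dose: {dose: {subject: [drug-session rates]}}
    """
    curve = {}
    for dose, by_subject in sorted(drug_rates_by_dose.items()):
        # First average repeated observations within each subject,
        # then average across subjects.
        per_subject = [
            mean(percent_of_control(r, control_rates[s]) for r in rates)
            for s, rates in by_subject.items()
        ]
        curve[dose] = mean(per_subject)
    return curve


# Hypothetical rates (responses per sec) for two subjects, A and B.
control = {"A": [0.5, 0.7], "B": [2.0, 2.4]}
drug = {
    0.1: {"A": [0.9, 0.9], "B": [2.2, 2.2]},
    1.0: {"A": [0.3, 0.3], "B": [1.1, 1.1]},
}
curve = dose_effect_curve(control, drug)
for dose, pct in curve.items():
    print(f"dose {dose}: {pct:.0f}% of control")
```

Because each subject is scaled to its own control level, curves obtained under schedules with very different absolute rates can be plotted on the same percentage axis, with 100% marking no change.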

Cook and Catania (1964) studied an FI schedule of 
electric shock termination in a group of squirrel mon-
keys. A pulsating electric shock of low intensity was
continuously delivered, and the first response after 10 
min terminated the shock; under this schedule, the 
rate of responding depended upon the intensity of the 
pulsating shock. Monkeys of another group were food- 
deprived and studied under an FI 10-min schedule of 
food presentation. The parameters of the schedules
were selected to give comparable rates of responding, 
and response patterns characteristic of FI schedules 
were maintained under both food presentation and 
shock termination. Chlorpromazine and imipramine 
decreased rates of responding under both types of FI 
schedules, while selected doses of amphetamine, 
meprobamate, and chlordiazepoxide increased rates 
of responding. These results support the view that the 
behavioral effects of these drugs depend mainly upon 
schedule-controlled patterns of responding. 

Fig. 8. Characteristic mult FI FR schedule performance main-
tained in squirrel monkeys by food presentation (upper record,
monkey S-1) and by stimulus-shock termination (lower record,
monkey S-26). The sequence of visual stimuli and corresponding
schedules is the same in the upper and lower records. At the
beginning of the records, the 10-min FI schedule was in effect
in the presence of a white stimulus. At the termination of the
FI component the recording pen reset to the bottom of the
record, and a pattern of horizontal lines was present for 2.5
min; during this time-out (TO) period, responses had no pro-
grammed consequences. The next short diagonal stroke on the
cumulative record indicates that the 30-response FR component
was in effect in the presence of a red stimulus. Again, the
cumulative recording pen reset to the bottom of the record
at the termination of the FR component and was followed by
the 2.5-min time-out component. This cycle was repeated
throughout each session. At the bottom of the record for
monkey S-26, the short diagonal strokes on the event line
indicate electric shock (6.2 mA) presentation. (Modified from
Kelleher & Morse, 1964.)


The other study directly compared the importance
of type of reinforcer and schedule of reinforcement as
determinants of the behavioral effects of drugs (Kelle-
her & Morse, 1964). Under some conditions the termi-
nation of a schedule complex, comprising a visual
stimulus and an associated schedule of shock presenta- 
tion, can maintain schedule-controlled patterns of 
responding characteristic of FI, FR, and mult FI FR 
schedules in the squirrel monkey (Morse & Kelleher, 
1966). In one series of experiments, responding under 
such a mult FI FR schedule was compared with re- 
sponding under a mult FR FI schedule of food presen- 
tation (Kelleher & Morse, 1964). Although maintained 
by different events, performances under the two multi- 
ple schedules were similar. Representative records for 
two monkeys are shown in Figure 8. The FR compo-
nent of each multiple schedule sustained a high rate
(about 2.3 responses per sec). The FI component of 
each multiple schedule was characterized by a pause 
(period of no responding) followed by acceleration of 
responding to a steady rate; the average rate in the 
interval was about .6 response per sec. 
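
The two contingencies that make up such a multiple schedule can be stated compactly. The sketch below, in Python, is an illustrative simplification: the class and method names are hypothetical, time is in seconds, and the time-out periods and the alternation of components described for Figure 8 are omitted. Under the fixed-ratio rule every n-th response produces the consequent event; under the fixed-interval rule only the first response after the interval has elapsed does so.

```python
class FixedRatio:
    """FR n: every n-th response produces the consequent event."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        # The time of the response is irrelevant under a ratio schedule.
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class FixedInterval:
    """FI: the first response after `interval` sec produces the event."""

    def __init__(self, interval, start=0.0):
        self.interval = interval
        self.start = start

    def respond(self, t):
        # Responses before the interval has elapsed have no consequence.
        if t - self.start >= self.interval:
            self.start = t  # assumption: next interval is timed from this event
            return True
        return False


# A 30-response FR: only the 30th response is followed by the event.
fr = FixedRatio(30)
fr_events = [fr.respond(t) for t in range(30)]
print(fr_events.count(True))

# A 10-min FI: responses at 100 s and 599 s do nothing; the one at 600 s does.
fi = FixedInterval(600.0)
fi_events = [fi.respond(t) for t in (100.0, 599.0, 600.0)]
print(fi_events)
```

Under the ratio rule the frequency of the consequent event rises directly with response rate, whereas under the interval rule responding early in the interval has no programmed consequence; the rules themselves say nothing about the pause and acceleration actually observed, which are properties of the performance.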

Fig. 9. Effects of d-amphetamine sulfate on rates of responding
under multiple FI FR schedules of food presentation and
stimulus-shock termination. Three squirrel monkeys were
studied on each multiple schedule. Each drug was given intra-
muscularly immediately before the beginning of a 2.5-hr session.
At least duplicate observations were made on each monkey at
each dose level. Summary dose-effect curves for the four
component schedules were obtained by computing the means
of the percentage changes in average response rates from control
to drug sessions. The dashed line at 100% indicates the mean
control level for each component. The vertical lines on the
left of the figure indicate the ranges of control observations
expressed as a percentage of the mean control value. Note the
general similarity of the pairs of dose-effect curves for FI and for
FR components. (Modified from Kelleher & Morse, 1964. © 1964
by the Society for the Experimental Analysis of Behavior, Inc.)


The effects of d-amphetamine on responding under
each of the component schedules are shown in Figure
9. Except at the highest dose, d-amphetamine in-
creased rates of responding under both FI schedules
but decreased rates of responding under both FR
schedules. Note that .3 mg/kg of d-amphetamine, 
which produced the maximum increase in rates of 
responding on both FI schedules (relatively low con- 
trol rates), decreased rates of responding on both FR 
schedules (relatively high control rates). Many in- 
vestigators have found that amphetamines tend to 
increase response output under schedules that main- 
tain low rates of responding but tend to decrease re- 
sponse output under schedules that maintain high 
rates of responding. It is often assumed that decreases 
in responding maintained by food presentation are 
caused by anorexic effects of amphetamine even 
though such decreases occur under a variety of condi-
tions. The similarity of the pairs of dose-effect curves
in Figure 9 indicates that this interpretation is wrong. 
A mere decrease in responding after amphetamine, or 
any other drug, is not sufficient evidence of anorexia. 
Figure 9 shows that the effects of d-amphetamine de- 
pend more upon the type of schedule than upon the 
scheduled event (Kelleher & Morse, 1968b).


Fig. 10. Dependence of effect of d-amphetamine on predrug
rate of responding in a squirrel monkey. Abscissa: average rate
of responding in successive minutes of a 10-min FI schedule
(circles) and under a 30-response FR schedule (triangles); ordi-
nate: relative rate of responding after .3 mg/kg d-amphetamine,
intramuscularly. Rates of responding were recorded separately
during the FR component and during successive minutes of
the FI component. Open and filled symbols indicate data from
two different sessions. The line through the points was fitted by
inspection. Based on data of a single monkey used in
computing the averaged data under FI and FR schedules of
stimulus-shock termination in Figure 9. (From Kelleher &
Morse, 1968b. © 1968 by the Society for the Experimental
Analysis of Behavior, Inc.)


Fig. 11. Dependence of effect of chlordiazepoxide and mepro-
bamate on intensity of electric shock or on predrug rate of re-
sponding in the squirrel monkey. Ordinate: relative rate of
responding after oral doses of each drug; abscissa: shock in-
tensity (left frame) and predrug rate of responding correspond-
ing to each of the three shock intensities (right panel). Axis
labels: shock level (mA) and baseline response rate (resp./min).
(From Cook & Catania, 1964. Reprinted from Federation
Proceedings 23: 832, 1964.)


There is a graded relation between the increase in 
low rates of responding and the decrease in high rates 
of responding after amphetamines. As shown in Figure 
10, the relative response rate is an inverse linear func- 
tion of the control rate. The two sets of data points 
are derived from the rates during complete sessions 
after d-amphetamine (.3 mg/kg, intramuscularly) and 
the corresponding rates during the previous control 
sessions under the multiple schedule of stimulus-shock
termination. This same functional relation has been
found in several different species under conditions in
which different predrug rates of responding were en-
gendered by different schedules of reinforcement, or
by sampling different temporal periods of a single
schedule. This model of amphetamine action sug-
gests that observed increases and decreases in respond-
ing do not reflect qualitatively different processes.

In the experiments by Cook and Catania (1964) in 
which squirrel monkeys responded under an FI 10- 
min schedule of termination of electric shock, the 
effects of meprobamate and chlordiazepoxide de- 
pended upon the average predrug rate of responding, 
which in turn depended upon the intensity of the 
electric shock. The proportional increases in rate of 
responding were inversely related to predrug rates of 
responding for both drugs, except that the highest 
predrug rates were slightly decreased by both drugs 
(Figure 11). The rate-dependent effects of meproba- 
mate and chlordiazepoxide appear similar to those of 
the amphetamines and barbiturates, but have not 
been as thoroughly studied.

In considering Figure 11 again, on the left side the
drug effects are shown to depend upon electric shock
intensity; on the right side they are shown to depend
upon the control rate. The two functions are similar
because variations in shock intensity changed the con-
trol rate. The function on the left has limited predic-
tive generality, however, while that on the right fits
these data into the broader context of rate-dependent
effects. The advantage of describing these results in 
terms of rate dependencies is that they take on an ap- 
plicability beyond the situation in which they were 
observed. 

Although the actions of many drugs on behavior 
can be quantitatively related to the predrug rate of 
responding, this does not imply that all the behavioral 
effects of drugs can be interpreted as rate dependen- 
cies (see Kelleher & Morse, 1968b). Nevertheless, rate 
dependencies do operate widely and with profound 
effects. In any experiments in behavioral pharmacol-
ogy, it is necessary to take into account the predrug
rate in order to make valid predictions. Many of the 
seemingly qualitative differences in the effects of 
drugs on different performances result from a quanti- 
tative difference in predrug rates of responding. 
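
The rate-dependency relation described for Figure 10 can be made concrete by relating the relative rate after drug (drug rate divided by control rate) to the control rate. The Python sketch below fits a straight line by ordinary least squares on logarithmic coordinates. Both the logarithmic axes and the least-squares fit are assumptions made for illustration (the line in Figure 10 was fitted by inspection), the function names are hypothetical, and the rates are invented so that they fall exactly on a line.

```python
import math


def fit_rate_dependency(control_rates, drug_rates):
    """Fit log10(drug/control) = a + b * log10(control) by least squares."""
    xs = [math.log10(c) for c in control_rates]
    ys = [math.log10(d / c) for c, d in zip(control_rates, drug_rates)]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def predicted_relative_rate(a, b, control_rate):
    """Relative rate after drug predicted for a given control rate."""
    return 10 ** (a + b * math.log10(control_rate))


# Hypothetical rates (responses per sec): low control rates are increased
# after drug and high control rates are decreased.
control = [0.01, 0.04, 1.0, 4.0]
drug = [0.1, 0.2, 1.0, 2.0]

a, b = fit_rate_dependency(control, drug)
low = predicted_relative_rate(a, b, 0.01)   # above 1: an increase
high = predicted_relative_rate(a, b, 4.0)   # below 1: a decrease
print(f"slope = {b:.2f}, low-rate effect = {low:.1f}, high-rate effect = {high:.1f}")
```

A single negative slope yields both the increase at low control rates and the decrease at high control rates, which is the sense in which seemingly qualitative differences in drug effects can follow from a quantitative difference in predrug rate.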

The strong dependence of the effects of drugs on 
schedule-controlled behavior has implications that go 
beyond behavioral pharmacology. It indicates that be- 
havioral processes such as reinforcement or punish- 
ment must be viewed in the context of ongoing 
behavior. Schedule-controlled behavior not only gives 
rise to organized, integrated performances but deter-
mines how other interventions will further modify
behavior. Rates and patterns of schedule-controlled 
responding are, therefore, fundamental properties of 
behavior. 


DRUG INJECTIONS AS CONSEQUENT 
EVENTS MAINTAINING BEHAVIOR 


Experiments on the use of drugs as consequent or 
discriminative stimuli provide other instances in 
which the solution of practical problems in behavioral 
pharmacology has contributed to the study of behav-
ior generally. During the past decade it has been
demonstrated repeatedly that responding in experi-
mental animals can be maintained by the intravenous
injection of drugs from several different classes (for
example, see Deneau, Yanagita, & Seevers, 1969). The
injection of the drug thus functions as a reinforcer in 
these situations. As with any environmental event, 
drug injections maintain operant behavior only under 
certain conditions; as more information accumulates, 
the control over behavior improves. For example, 
rates of responding maintained by injections of co- 
caine or d-amphetamine are of the order of 50 times 
greater in current experiments than in some of the 
earliest ones (see Goldberg, 1973). Better control comes 
about through better specification of the relevant 
conditions. 

Much of the research on the self-administration of 
drugs has been motivated by practical problems of 
drug abuse in man. One interest has been in develop-
ing animal models that would predict the abuse po-
tential of drugs in man. The most commonly used
procedure is to allow a subject to inject a given dose
of a drug with each response for an extended period
of time each day (3 to 24 hr). Such procedures are 
capable of distinguishing between many drugs that 
are likely to be abused in man and certain drugs that 
are unlikely to be abused. Comparing the levels of 
behavior maintained by different drugs provides prac- 
tical information of limited generality—as, for exam- 
ple, in comparing the amounts eaten of oatmeal and 
Cream of Wheat. Under some conditions neither 
would be taken, and a starving person would take 
both. It is a mistake to consider drugs as having inher- 
ent reinforcing efficacies. In determining the charac- 
teristics of a drug as a reinforcer or punisher, the 
conditions that are sufficient to develop the same 
operant behavior should be determined. As with food
or electric shock, the capacity of a particular dose of
drug to maintain behavior depends upon various con-
ditions and may change over time. Doses of cocaine
that will not maintain responding initially may do so
in individuals with well-developed behavior (Gold- 
berg, 1973). There are many examples of drug-taking 
behavior being modified by a subject’s history. For ex- 
ample, Schlichting, Goldberg, Wuttke, and Hoffmei- 
ster (1971) found that the rate and pattern of respond- 
ing maintained under FR schedules of d-amphetamine 
injections depended on whether rhesus monkeys had 
a history of responding maintained by cocaine, co- 
deine, or pentobarbital. Thus rates and patterns of 
responding maintained by drug injections are a com- 
posite result of the history of the individual, the 
schedule of drug injection, and the dose of drug in- 
jected. 

Previously we have used the term metastable to 
refer to two different stable patterns of responding 
maintained under the same schedule parameters, one 
before and one after an intervening treatment (Morse
& Kelleher, 1966, 1970; Staddon, 1965). Instances of
opposite effects of consequent events might be viewed
as extreme cases of metastability. In an earlier section
of this chapter it was noted that the drug nalorphine 
can both enhance behavior leading to its presentation 
and enhance behavior associated with its postpone- 
ment (see Goldberg, Hoffmeister, & Schlichting, 1972). 
Intravenously injected drugs, like electric shocks, are 
presented directly to the subject, which makes it easier 
to use the same event in different ways. It may be of 
no fundamental significance that mainly “noxious 
events” have been shown to function in a variety of 
modes. The converse situation for drugs that are gen-
erally used as “positive” reinforcers has not been
studied, but Smith and Clark (1972) have shown that 
there are conditions under which food delivery will 
be postponed by food-deprived subjects. Thus it seems 
clear that the maintenance of behavior by self-injected 
drugs is determined by various conditions, only one 
of which is the intrinsic properties of the drug itself. 

Some conditions have been determined under which 
patterns of responding maintained by FR and FI 
schedules of drug injection are comparable to per- 
formances maintained by similar schedules of food 
presentation (Goldberg, 1973; Goldberg, Kelleher, & 
Morse, 1975). An important parameter in any ex- 
periment on drug self-administration is the dose, 
which may critically determine the level of responding 
under certain schedules. The amount of food has not 
seemed important in many studies with schedules of 
food presentation, but this is because the amount of
food presented has been relatively constant and ap- 
propriate to the schedule parameters. When extreme 
amounts of food are presented or food delivery is very
intermittent, the amount of food presented becomes
important (Morse, 1966; see Collier, Hirsch, & Kanarek,
Chapter 2 in this volume). 

Variations in drug dose and amount of food can 
have similar effects (Goldberg, 1973). For example, 
average rate of responding under 10- or 30-response 
FR schedules first increased and then decreased as the 
dose of cocaine injected was increased or as the amount 
of food presented was increased. Increasing the dose 
of cocaine or the amount of food resulted in a high 
rate of responding at the beginning of each session, 
but rates of responding decreased as the session pro- 
gressed. The effects of varying the amount of drug or 
food were also studied under a second-order FI 
schedule of FR components, each terminating with a 
briefly presented visual stimulus. (A second-order 
schedule is one in which the behavior specified by a 
schedule contingency is treated as a unitary response 
that is itself reinforced according to some schedule— 
Kelleher, 1966; see Gollub, Chapter 10, and Zeiler, 
Chapter 8 in this volume.) Under the second-order 
schedule, response rates remained constant as the 
parameter value of the reinforcer was varied over a 
wide range. Again, the functions relating response 
rate to amount of drug or food were similar to one an- 
other, although they differed from the functions un- 
der simple FR schedules. The lower frequency of 
drug injection or food presentation under the second- 
order schedules limits cumulative effects that may de- 
crease rates of responding (see also footnote 4). Once 
again, the way behavior is controlled by consequent
events depends more upon the schedule than the type 
of scheduled event. Although injections of cocaine 
and presentation of food have very different proper- 
ties, striking parallels between drug-maintained and 
food-maintained behavior can be obtained when they 
are studied under comparable schedules. Indeed, we
may ask whether studies of this nature have implica- 
tions for the abuse of food. 
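
As a rough illustration of the second-order arrangement defined parenthetically above, the following Python sketch treats each completed FR component as a unitary response under an FI schedule. The function name, the argument names, the timing of the interval from the previous reinforcer, and the presentation of the brief stimulus at the end of every component are assumptions made for illustration; they are not taken from the cited experiments.

```python
def second_order_fi_fr(interval, ratio, response_times):
    """Sketch of a second-order FI schedule of FR components.

    Returns (stimulus_times, reinforcer_times). All names and timing
    conventions here are illustrative assumptions.
    """
    stimulus_times = []
    reinforcer_times = []
    count = 0
    interval_start = 0.0
    for now in response_times:
        count += 1
        if count < ratio:
            continue
        # An FR component has been completed: it counts as one unitary response.
        count = 0
        stimulus_times.append(now)  # brief stimulus ends every component
        # The first component completed after the interval elapses is reinforced.
        if now - interval_start >= interval:
            reinforcer_times.append(now)
            interval_start = now
    return stimulus_times, reinforcer_times
```

With a 10-unit interval and 2-response components, only a component completed after 10 units have elapsed produces the reinforcer; every other completed component produces the brief stimulus alone.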


RESPONSE-PRODUCED ELECTRIC SHOCKS 
AS CONSEQUENT EVENTS 
MAINTAINING BEHAVIOR 


The evidence is overwhelming that behavior is 
more controlled by the nature of the prevailing sched- 
ule than by the nature of the scheduled events. As 
noted earlier, compelling support for this view comes 
from experiments in which the same event has dis- 
parate or opposite effects on behavior when scheduled 
differently. The most thoroughly studied examples of 
such opposite effects are the maintenance and sup- 
pression of behavior by response-produced electric 
shock (discussed in earlier sections). Until recent years,
response-produced electric shocks were seldom used 
under conditions in which they increased subsequent 
responding, yet numerous studies have shown diverse 
conditions under which key pressing is reliably main- 
tained by response-produced electric shocks. Many of 
the features mentioned in earlier sections (“The Continuity
of Behavior in Time” and “Ongoing Behavior”)
are important in developing such behavior; by
having an existing level of ongoing responding and by 
scheduling the electric shocks intermittently, the 
schedules of shock presentation may come to modu- 
late responding and develop schedule control. 

Various studies have shown that responding can be 
maintained under FI schedules of electric shock delivery
in squirrel monkeys trained under schedules of
electric shock postponement that engender steady rates
of responding. For example, McKearney (1968, 1969)
studied squirrel monkeys trained under such an avoidance
schedule. A 10-min FI schedule of response-produced
electric shock was then introduced concurrently.
Subsequently, when the schedule of electric
shock postponement was eliminated and only the 
10-min FI schedule of response-produced electric 
shock was in effect, a pattern of positively accelerated 
responding developed and was well maintained. Similarly,
Byrd (1969) has shown in the cat that after a
history of postponement of electric shocks, responding 
can be well maintained under an FI schedule of elec- 
tric shock presentation. McKearney (1969) also studied 
a range of electric shock intensities and FI durations. 
As the fixed interval was decreased from 10 to 1 min,
patterns of responding (as indicated by quarter-life
values) were little affected, but rates of responding 
were inversely related to the FI duration. Responding 
ceased, however, when electric shocks were no longer 
scheduled and redeveloped when shocks were again 
presented under the FI schedule. 
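
The quarter-life measure mentioned above can be sketched as follows, assuming the usual definition: the time taken to emit the first quarter of the responses in an interval, expressed as a proportion of the interval. The function name is hypothetical, and taking the time of the response that completes the first quarter (rather than interpolating on the cumulative record) is a simplifying assumption.

```python
import math


def quarter_life(response_times, interval):
    """Proportion of the interval elapsed when the first quarter of the
    responses has been emitted (simplified; no interpolation)."""
    times = sorted(response_times)
    if not times:
        raise ValueError("quarter-life is undefined without responses")
    # Index of the response that completes the first quarter of the total.
    k = math.ceil(len(times) / 4)
    return times[k - 1] / interval
```

Under this sketch a steady rate gives a value near 0.25, and positively accelerated responding gives a larger value, because the early responses come late in the interval.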

A study by Byrd (1972) has shown that responding 
can be established and maintained under second-order 
schedules of electric shock presentation. Again, in 
squirrel monkeys with a history of responding under 
schedules of electric shock postponement, characteris- 
tic FI patterns of responding were maintained under 
an FI 8-min schedule of electric shock presentation; a 
brief (1-sec) visual stimulus immediately preceded 
each electric shock. Performance was subsequently 
maintained under a second-order schedule in which 
the brief stimulus was presented under an FI 4-min 
schedule component; electric shock was delivered only 
after the completion of four FI components. Charac- 
teristic positively accelerated responding was engen- 
dered in the individual FI components. Patterns of 
responding maintained by presentation of the brief
stimulus intermittently associated with delivery of an 
electric shock were similar to those maintained by 
brief stimuli intermittently associated with food pre- 
sentation (see Gollub, Chapter 10 of this volume). 

Under some conditions, the introduction of the FI 
schedule of shock presentation can be abrupt. For ex- 
ample, in one study, squirrel monkeys were trained to 
postpone electric shocks under schedules in which the 
period of time by which shock was postponed de- 
creased with successive responses until a shock was 
delivered automatically (Kelleher & Morse, 1969). Cer- 
tain parameters of this interlocking schedule of elec- 
tric shock postponement engendered a stable pattern 
of positively accelerated responding between electric 
shocks (Figure 12, upper frame). A monkey trained 
under this schedule was then maintained under an 
FI 5-min schedule of electric shock presentation; the
patterns of positively accelerated responding were 
more marked than they had been under the schedule 
of shock postponement (Figure 12, center and bottom
frames). 

An experiment was described earlier (see Figure 1)
in which responding was both maintained and sup- 
pressed by the same response-produced electric shock 
(Kelleher & Morse, 1968a). This experiment is signifi- 
cant in showing that schedule conditions other than 
electric shock postponement can be used to develop FI 
performances with response-produced electric shock.
Two monkeys were trained initially under a VI sched- 
ule of food presentation, and then FI schedules of 
electric shock presentation were superimposed on the 
schedule of food presentation. In one monkey, responding
was initially suppressed under the combined
schedule but subsequently recovered. Recovery from 
punishment has been frequently observed (Azrin & 
Holz, 1966). The rate of responding of the other monkey
was more suppressed but later recovered after
numerous changes in the schedules of shock presenta- 
tion and after prolonged exposure to low shock in- 
tensities followed by gradually increasing shock in- 
tensities. Eventually, responding was enhanced in 
both monkeys under the combined schedules of food 
presentation and electric shock presentation and con- 
tinued to be maintained when the food schedule was 
eliminated (for details see Kelleher & Morse, 1968a). 
As noted earlier (under “Disparate Effects of Consequent
Events”), in one experiment, the first response
occurring after 10 min produced an electric shock, and 
each subsequent response during the 11th minute also 
produced a shock. A 1-min time-out period occurred
at the end of the 11th minute. Clear patterns of posi- 
tively accelerated responding developed during the 
Fig. 12. Performances under an interlocking schedule of postponement
of electric shocks (upper frame) and an FI 5-min
schedule of presentation of electric shock (middle and bottom 
frames) (monkey 5-67). Short diagonal strokes on both cumula- 
tive and event records indicate 3-mA shock presentations. The 
pattern of positively accelerated responding became more 
marked when response-produced shocks occurred under the FI 
schedule. (Modified from Morse & Kelleher, 1970.) 


first 10 min of each cycle, whereas responding during 
the 11th minute of each cycle remained almost com- 
pletely suppressed (see Figure 1). Studies of variations 
in shock intensity showed that the mean number of 
responses per session increased from 1,548 at a shock 
intensity of 1 mA to 4,227 at a shock intensity of 12.7 
mA. During the entire study, responding in the 11th 
minute of each cycle was completely suppressed. 
When the time-out period was eliminated so that 
each 11-min cycle was followed immediately by the
start of the next cycle, performance was affected (Fig- 
ure 13). An increase in responding during the early 
part of some cycles was a transient effect. Responding
under the 1-response FR component in the 11th minute
increased gradually and stabilized at a higher rate
than had been maintained with the time-out; this 
resulted in a three- to fourfold increase in the num- 
ber of shocks delivered. Thus the effects of electric 
shock in suppressing responding during the FR com- 
ponent were more pronounced when a time-out period 
followed that component. When scheduled shocks 
were omitted, responding gradually decreased to near 
zero; when electric shocks were scheduled again, the 
previous performance was gradually recovered (Figure 
14). The extinction of performance under the two- 
component schedule appears to be similar to that 
occurring during extinction after FI schedules. The 
persistence of key pressing under this schedule con- 
trasts with the rapid cessation of leash pulling de- 
scribed earlier (in the section headed “Characteristics 
of Responses”).

Both maintenance and suppression of responding 
with response-produced electric shocks have also been 
observed under a multiple schedule (McKearney, 
1972). In squirrel monkeys previously trained under a 
schedule of electric shock postponement, characteristic 
steady rates of responding were maintained under a 
VI 3-min schedule of electric shock presentation in the 
Fig. 13. Performance under a two-component FI 10-min FR 1
schedule of electric shock presentation without a time-out
period separating 11-min cycles. Shock presentations are indicated
by a diagonal stroke on cumulative and event records;
the termination of one cycle (and the beginning of the next
cycle) is indicated by the recording pen resetting to the base
line. A-C—Sessions 186, 187, and 194. Note that there was less
suppression during the FR 1 component when it was not
followed by a time-out period than when it was (Figure 1).
(From Kelleher & Morse, 1968a.)
presence of a visual stimulus. Then an FR 1 schedule
of electric shock presentation was in effect during
certain 1- or 3-min periods associated with a different
stimulus. Although the parameters of electric shock 
were identical in the two components of the multiple 
schedule, rates of responding were well maintained 
under the VI schedule but suppressed under the FR 
schedule. 

The experiments by Kelleher and Morse (1968a) 
and by McKearney (1972) emphasize the importance 
of the schedule of electric shock presentation because 
identical electric shocks had opposite effects on re- 
sponding under two different schedules. Responding 
was maintained by electric shocks presented under an 
FI 10-min schedule or a VI 3-min schedule and suppressed
by electric shocks presented under an FR 1
schedule. The schedule of electric shock delivery de- 
termined whether its effects were characteristic of rein- 
forcement or of punishment. 

The effects of events that modulate behavior de- 
pend not only on the nature of the events and the 
schedule under which they are presented but also 
upon the experimental history of the individual. The
historical determination of behavior does not neces- 
sarily imply any lack of modifiability. Although the 
conditions under which electric shocks came to control
behavior in the examples above were complex
and depended on history, the maintained performances 
were under the control of the prevailing schedule of 
shock presentation. 

A final example showing how historically deter- 
mined behavior is modulated by current conditions is 
provided by a study in which a squirrel monkey had 
been trained to press one of two keys (key R) under a 
schedule of electric shock postponement, and responding
on this key was then maintained under an FI
5-min schedule of electric shock presentation. Occasional
responses occurred on the other key (key L) in
every experimental session, although they had no programmed
consequences. When the contingencies were
reversed so that the FI schedule of electric shock presentation
was in effect on key L and responding on key
R had no programmed consequences, a period of
transition followed. Responding on key R declined
while responding on key L increased and became positively
accelerated. Eventually, a characteristic FI pattern
of responding was maintained on key L while
low levels of responding occurred on key R (Figure
15). The changed contingency “shaped” a pattern of
responding on key L. This example is important in
showing that historically determined performances
maintained by a schedule of electric shock presentation
are not simply the temporal modulation of a
highly stereotyped response pattern. The FI pattern of
responding occurred on the key associated with the
schedule of electric shock presentation.

Fig. 14. The extinction and redevelopment of performance
under the two-component mult FI 10-min FR 1 schedule of
electric shock presentation without time-out periods. Recording
as in Figure 13. A, B—Sessions 195 and 196, on extinction; C,
D—Sessions 197 and 199, on a two-component shock schedule.
(From Kelleher & Morse, 1968a. © 1968 by the Society for the
Experimental Analysis of Behavior, Inc.)

Fig. 15. Effects of changing the response key on which presentations
of electric shock are scheduled under FI 5-min. Ordinate
—cumulative number of key presses on key that produced 
electric shocks; abscissa—time. The recording pen reset to the 
base line with the presentation of electric shock and the be- 
ginning of a 1-min time-out period. Short diagonal strokes on 
the cumulative record indicate the end of the time-out; short
diagonal strokes on the event record indicate key presses on the 
key that did not produce electric shocks. A—stable performance 
under FI schedule of shock presentation programmed on key R; 
B, C, D—Sessions 3, 18, and 55 under the FI schedule of shock 
presentation programmed on key L. The average rate of re- 
sponding on key L gradually increased, while that on key R 
decreased after the contingency was changed. (Kelleher & Morse, 
unpublished observations.) 


CONCLUSIONS 


Valid concepts applicable to the scientific study of 
behavior evolved from discovering and controlling the 
determinants of orderly changes in responding. Important
determinants of reinforcement and punishment
are the parameters of consequent events, the
quantitative properties of ongoing behavior, and the 
ways consequent events are scheduled. The scheduling 
of relations between behavior and consequent events 
brings diverse factors into operation in time as a 
dynamic coherent complex. The notions of schedule 
and of schedule-controlled behavior conveniently char- 
acterize the sequential interaction between behavior 
and environment. 
Schedule control is the single most important property
of operant behavior. Partly because of the conception
of schedules as variations of a basic process of
reinforcement rather than as the actual determinants 
of behavior, it has only slowly been appreciated that 
schedule-controlled behavior can determine the effects 
of consequent events. That the schedule of presenta- 
tion of an event should determine the effect of the 
event is unexpected from traditional formulations; 
that it occurs suggests that traditional terms and time 
scales may be inappropriate. Diverse conditions will 
each result in characteristic reproducible and orderly 
behavioral performances. Even though it may not 
seem so at a superficial level, the discovery of the de- 
terminants of these diverse conditions gives a strong 
basis for generalizing about behavior. Reinforcement 
and punishment are best considered as reproducible 
behavioral processes. 

Some consequent events that maintain behavior are 
especially forcing at particular parameter values, so 
that past experience is of little consequence; except 
when there is some ongoing behavior, certain bland 
events are relatively ineffective and other snappy 
events are likely to suppress behavior. Many of the 
activities that people engage in, such as growing 
peonies, sailing boats, and riding motorcycles, may 
tell more about the history of the individual (or his 
affluence) than can anything inherent in the activities 
themselves. Some ongoing behavior or past experience 
may be important in the development of behavior but 
not in its continued maintenance. For example, teach- 
ing programs shape behavior in a graded way; how- 
ever, when the final level of competence is reached, 
the behavior at that time is no longer so critically 
dependent on slight gradations. Various devices and 
techniques are often used in initial development of 
lever pressing that are usually of no consequence after 
performances are well maintained. Traditionally, most 
experimental studies of reinforcement or punishment 
have used preemptive consequent events under con- 
ditions in which prior experience was not a critical 
determinant. Food presented to a highly deprived an- 
imal or a strong electric shock are immediately pre- 
emptive. Such consequent events are no better rein- 
forcers than those that do depend upon history. In 
nonexperimental situations most behavior is main- 
tained under conditions where history is important. 
One man is a lawyer, another a doctor; each “likes” 
his work and each is maintained by the environment. 
There is nothing about torts or about warts that is 
interesting to everyone. It is only under certain special 
circumstances that environmental consequences are 
especially forcing in engendering behavior. Even then, 
the conditions necessary for the development of behavior
are more critical than the conditions necessary
for its maintenance, implying that as soon as behavior 
develops, its history becomes important. In experi- 
mental situations behavior is often developed under 
forcing conditions where history is of little impor- 
tance, but in nonexperimental situations most be- 
havior is shaped from already existing behavior under 
conditions where the shaping sequence is important. 
How could it be otherwise? People are behaving all 
the time, and their behavior blends with the conditions
of the current environment. In contrast, in the
laboratory it is usually arranged so that there is no
strong ongoing behavior before it is developed.

In recent years, it has become possible experimentally
to study situations of increasing complexity and
to extend the range of conditions sufficient to engender
behavior into the domain of historical determinants.
While some may view this as a messy pollution
of well-established logical definitions, it brings the
range of phenomena studied in laboratory settings
closer to those of ordinary behavior. The concepts of
historically determined and schedule-controlled behavior
may seem unfamiliar and less precise than traditional
formulations, but they are more valid.


Schedules of Reinforcement


the controlling variables*


INTRODUCTION 


Schedules of reinforcement are among the most 
powerful determinants of behavior. The effects of 
each type of schedule are systematic and orderly in 
individual organisms, and they are replicable within 
and across species (for an example see Skinner, 1959, 
p. 374, Figure 14). The particular performance gen- 
erated depends on the schedule used, but each sched- 
ule has characteristic effects. In fact, one way to evalu- 
ate the adequacy of experimental control is by seeing 
if the behavior typical of specific schedules is repro- 
duced (Sidman, 1960). Failure to obtain the expected 
performances indicates deficiency in the experimental 
laboratory. 

Some psychologists have considered research on 
reinforcement schedules to be atheoretical, yet start- 
ing with Skinner (1938) and continuing with Ferster 
and Skinner (1957) and Morse (1966) there has been 
concern with theoretical analysis. Perhaps the pervasiveness
of theory has not been generally recognized
because the data have been so powerful that they demanded
and received major attention, while the more
conjectural theoretical efforts assumed secondary
status. Typically, theory has not been formal or quantitative;
it has been at a lower level, consisting of
hypotheses about the essential controlling relations
(but see Schoenfeld, Cole, Blaustein, Lachter, Martin,
& Vickery, 1972, for a more formal taxonomic approach).
The present purpose is to offer another such
analysis.

* Preparation of this chapter and several of the experiments
reported were supported by Research Grant GB-25959 from the
National Science Foundation. I would like to thank M. J. Marr,
W. H. Morse, and E. Davis for their comments.

The obvious origins of schedule research appear 
in Skinner’s (1938) demonstration that a reinforcer 
does not have to follow every response in order to 
maintain responding, but that it need only occur 
intermittently. The importance of intermittent rein- 
forcement eventually might have become apparent 
with discrete-trial procedures, but it happened immediately
within Skinner’s free operant paradigm. In
research in which each response terminated a trial the 
focus tended to be on resistance to extinction under 
low-valued ratio schedules (the partial-reinforcement 
effect); in contrast, the free operant experiments em- 
phasized the nature of performance under maintained 
reinforcement. The publication in 1957 of Ferster and 
Skinner’s Schedules of Reinforcement represents the 
beginning of the modern era of schedule research.
This encyclopedia of schedules not only describes the 
performances occurring under many simple and com- 
pound schedules, but also pioneered in treating sched- 
ules as a distinct subject matter. 

It became possible to use schedule performance to 
study the effects of other variables; for example, in 
behavioral pharmacology schedules provided a foun- 
dation for assessing the actions of drugs (see Harvey, 
1971). But it soon became evident that schedules did 
more than establish reliable and recoverable base 
lines. The schedule itself played an important role in 
determining how the variables of primary interest op- 
erated. It served not just as a convenient vehicle for 
observing other processes at work but, for example, 
could determine whether a certain dosage of a given 
drug increased, decreased, or had no effect on the rate 
of responding (cf. Kelleher & Morse, 1968). Dews
(1965) concluded: 


Schedule-controlled behavior does not merely
provide a baseline for convenient study of other
variables; it is itself close to the heart of the
matter. This emphasis on the importance of
schedules is not intended to imply that all of 
psychology should be reduced to a study of them. 
An influence can be all-pervading without being 
all-embracing. . . . It is suggested that schedule 
influences operate generally in psychology; that 
when these influences can operate, they will; and 
that a student of any problem in psychology—in 
motivation, generalization, discrimination, or
the functions of the frontal lobes—ignores the 
consequences of the precise scheduling arrange- 
ments of his experiments at his peril. (p. 148) 


The ubiquity of schedule effects means that an un- 
derstanding of how the scheduling of reinforcers de- 
termines performance is of fundamental significance. 
Intermittent reinforcement organizes and maintains 
highly predictable extended sequences of behavior, 
and it also determines the effects of many other variables.
The present chapter is an effort to describe how
intermittent reinforcement operates to control be- 
havior. 


TYPES OF SCHEDULES 


The word reinforcement refers to the effect of an 
operation; it does not describe an independent vari- 
able but is the interaction of an independent variable 
with behavior. By reinforcement is meant an increase 
in responding as a function of a stimulus event following
the response. The stimuli having these effects
are reinforcing stimuli or reinforcers. Schedules of
reinforcement are the rules used to present reinforcing 
stimuli. 


Time and Response Schedules 


The most widely used schedules are defined in 
terms of time and responses. They may or may not 
require a particular response. All response-indepen- 
dent schedules are time schedules, and they are re- 
ferred to as fixed-time (FT) or variable-time (VT) 
schedules depending on whether the interreinforcer 
time is fixed or changes from one reinforcer presentation
to the next. Other schedules are response-dependent.
Of these, the ones that only require responses
are ratio schedules, and they are fixed-ratio
(FR) or variable-ratio (VR) schedules depending on 
whether a fixed or variable number of responses is 
required. Interval schedules involve both response and 
temporal requirements, but they will be treated here 
as simple schedules. Interval schedules combine time 
schedules and a fixed-ratio schedule (FR 1): the first 
response emitted after a specific time has elapsed pro- 
duces the reinforcer, and earlier responses have no 
scheduled consequences. In fixed-interval (FI) sched- 
ules the time is constant; in variable-interval (VI) 
schedules it varies. In VT, VI, and VR schedules the 
experimenter determines the precise sequence of interreinforcer
times (VT, VI) or responses per reinforcer
(VR). There also are schedules that provide
reinforcer presentation after irregular time periods or 
irregular numbers of responses, but the precise se- 
quences are not prespecified. Instead, each time period 
or response is equally eligible for reinforcement ac- 
cording to some probability. These schedules are 
known as random-time (RT), random-ratio (RR), and
random-interval (RI) schedules depending on whether 
the probability of a reinforcer occurring refers to time 
alone, to responses alone, or to a response occurring at 
a certain time. 
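
The definitions above can be read as decision rules applied to each response (or, for time schedules, to the passage of time alone). The following Python sketch is illustrative only: the class names, the `respond` and `tick` methods, and the timing of intervals from the previous reinforcer are assumptions introduced here, not part of the original account.

```python
import random


class FixedRatio:
    """FR n: every n-th response produces the reinforcer."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, now):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR: the experimenter prespecifies the sequence of requirements."""

    def __init__(self, requirements):
        self.requirements = list(requirements)
        self.index = 0
        self.count = 0

    def respond(self, now):
        self.count += 1
        if self.count >= self.requirements[self.index]:
            self.count = 0
            self.index = (self.index + 1) % len(self.requirements)
            return True
        return False


class FixedInterval:
    """FI t: the first response after t has elapsed is reinforced;
    earlier responses have no scheduled consequences."""

    def __init__(self, t):
        self.t = t
        self.start = 0.0

    def respond(self, now):
        if now - self.start >= self.t:
            self.start = now
            return True
        return False


class FixedTime:
    """FT t: the reinforcer is presented every t units; no response is required."""

    def __init__(self, t):
        self.t = t
        self.start = 0.0

    def tick(self, now):
        if now - self.start >= self.t:
            self.start = now
            return True
        return False


class RandomRatio:
    """RR: every response is equally eligible, with probability p."""

    def __init__(self, p, rng=None):
        self.p = p
        self.rng = rng or random.Random()

    def respond(self, now):
        return self.rng.random() < self.p
```

Variable-interval and variable-time schedules would follow the `VariableRatio` pattern, cycling through a prespecified list of times instead of response counts.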

Response-independent schedules are here referred 
to as time schedules, despite earlier references to them 
as interval schedules preceded by an appropriate 
qualifying adjective (e.g., “free,” “response-independent,”
“noncontingent”). Since interval schedules by
definition require a response, to use them to refer to 
a response-independent arrangement is internally in- 
consistent and misleading. The time schedule designa- 
tion avoids this ambiguity. 


Differentiation Schedules


In differentiation schedules reinforcers are pre- 
sented when a response or a group of responses dis- 
plays a specified property. For example, responses 
might have to be emitted with a particular force, 
duration, or form (topography) or to occur in a cer- 
tain locus. Differentiation schedules are involved in 
shaping new responses, but they also encompass certain
unchanging requirements. Interresponse-time
(IRT) schedules establish the time between successive
responses as the requirement. If the time must equal
or exceed the specified value, this is an IRT > t schedule.
If the response must occur before a specified time
period elapses, this is an IRT < t schedule. If the
reinforcer is presented whenever a specified response
has not occurred for a certain time period, this is an
R > t schedule. The IRT > t, IRT < t, and R > t
schedules all are differentiation schedules involving
intervals between responses as a prerequisite for reinforcement.
Since in an R > t schedule, not emitting a
certain response is treated as if it were a response, reinforcement
is manifested by a decreased frequency of
the criterion response.
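
The three interval-based requirements can be sketched in Python as follows. The function names are hypothetical, and two conventions are assumptions made here: the first response of a session only starts the timing, and under R > t the timer restarts after each response and after each reinforcer delivery.

```python
def irt_greater(t, response_times):
    """IRT > t: a response is reinforced if at least t has elapsed
    since the previous response."""
    reinforced = []
    previous = None
    for now in response_times:
        if previous is not None and now - previous >= t:
            reinforced.append(now)
        previous = now
    return reinforced


def irt_less(t, response_times):
    """IRT < t: a response is reinforced if it occurs before t has
    elapsed since the previous response."""
    reinforced = []
    previous = None
    for now in response_times:
        if previous is not None and now - previous < t:
            reinforced.append(now)
        previous = now
    return reinforced


def no_response_for(t, response_times, session_end):
    """R > t: the reinforcer is delivered whenever t elapses without
    the criterion response. Returns the delivery times."""
    deliveries = []
    timer_start = 0.0
    for now in sorted(response_times) + [session_end]:
        # One delivery for each full period of t that passes before
        # the next response (or before the session ends).
        while timer_start + t <= now:
            timer_start += t
            deliveries.append(timer_start)
        timer_start = now  # a response resets the timer
    return deliveries
```

Note that only the first two functions return reinforced responses; the third returns times at which the absence of the response was reinforced.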

The IRT > t, IRT < t, and R > t designations replace
DRL, DRH, and DRO. The problem with the
old usage is that it confused a theoretical account of
the effects of the schedules (differential-reinforcement-of-low-rate,
differential-reinforcement-of-high-rate,
differential-reinforcement-of-other [or not-] responding)
with the simple description of the prescription for
reinforcer delivery.


t-τ Schedules


Schoenfeld and his colleagues (Schoenfeld et al., 
1972) have devised schedules based on temporal pa- 
rameters combined with varying probability of rein- 
forcement for single responses. The probability of 
reinforcer presentation occurring in any part of a re- 
peating time cycle can be varied between 0.00 and 
1.00, either dependent on or independent of a re- 
sponse. If the cycle duration is fixed and the probabil- 
ity of a reinforcer following the first response of a 
cycle is 1.00, it is equivalent to an FI schedule; if the 
probability of a reinforcer for the first response of a 
cycle is less than 1.00, it is an RI schedule. If the 
probability of reinforcement occurring for all re- 
sponses is greater than 0.00 but less than 1.00, this is 
an RR schedule. These are limiting cases. Combina-
tions of cycle lengths, periods of reinforcer availabil-
ity, and probabilities of reinforcement for each re-
sponse can be manipulated to generate numerous
schedules.
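
The limiting cases just described can be illustrated with a small sketch. It is not Schoenfeld's own formulation: the function names are hypothetical, cycles are assumed to be timed by a free-running clock, and the random generator is seeded only so that the example is repeatable.

```python
import random


def first_response_schedule(response_times, cycle, p, rng=None):
    """Reinforce the first response of each repeating cycle with
    probability p.

    p == 1.0 gives the FI-like limiting case; 0 < p < 1 gives the
    RI-like case. Assumption: cycles are timed from session start and
    do not reset at reinforcer presentation.
    """
    if rng is None:
        rng = random.Random(0)
    reinforced = []
    last_cycle = None
    for t in sorted(response_times):
        index = int(t // cycle)
        if index != last_cycle:  # first response in a new cycle
            last_cycle = index
            if rng.random() < p:
                reinforced.append(t)
    return reinforced


def every_response_schedule(response_times, p, rng=None):
    """Reinforce every response with probability p (the RR-like case
    when 0 < p < 1)."""
    if rng is None:
        rng = random.Random(0)
    return [t for t in sorted(response_times) if rng.random() < p]
```

Note that response count never enters either function; only the time of a response and a probability determine reinforcer delivery.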

Although Schoenfeld et al. (1972) have proposed
t-τ schedules as a comprehensive schedule classifica-
tion system, it does not incorporate fixed- and variable-
ratio schedules directly. Response count does not enter
into the specification of a schedule; probability of
reinforcement is applied only to individual responses.
Although performance typical of ratio schedules can
be obtained by appropriate manipulation of the
temporal parameters, this does not mean that response
count is an irrelevant independent variable. Schedules
still can be specified based on fixed and variable num-
bers of responses without reference to temporal param-
eters. Similar performances generated by ratio and t-τ
schedules pose the challenge of finding characteristics
common to both types. The t-τ schedules do not in
themselves explain ratio performance.


Extinction 


In an extinction schedule no reinforcer is pre-
sented. Extinction is not a schedule of reinforcement,
but it is included here to provide a comprehensive
list of common scheduling operations. The various
schedules described so far (time, ratio, interval, differ-
entiation, t-τ, extinction) can be combined in various
ways to produce compound schedules. Together they
comprise all reinforcement schedules known to date.


TYPES OF CONTROLLING RELATIONS: 
VARIABLES AND EFFECTS 


Direct and Indirect Variables


A schedule states the conditions that must obtain 
for a reinforcer to be delivered. These prerequisites 
are formal properties. All schedules arrange that cer- 
tain conjunctions of events must obtain at the mo- 
ment of reinforcer presentation, although individual 
schedules differ in what these events must be. These 
formally imposed prerequisites are the direct variables 
imposed by a schedule. In ratio schedules, for exam- 
ple, presentation of the reinforcer depends on the 
execution of a certain number of responses, so that it 
is a formal requirement that this number of responses 
precede every reinforcer. 

Other variables are not imposed directly. Although 
the time between successive reinforcer presentations is 
not specified by a fixed-ratio schedule, the characteris-
tics of performance establish a certain time period.
And, although a time schedule does not require that
any particular response occur, some behavior must 
precede the reinforcer. Indirect variables are those 
that are imposed without being explicitly prescribed 
by the schedule. One problem in a theoretical analy- 
sis of reinforcement schedules is to specify these in- 
direct variables and when and how they influence 
performance. 

It appears that any variable that occurs directly 
under one type of schedule can occur indirectly under 
others. For example, the time separating the rein-
forced response from the one preceding, the inter-
response time (IRT), is imposed at a specific value
under IRT > t and IRT < t schedules. Under any
schedule, however, some IRT precedes the reinforcer.
Or interreinforcer time, which is specified directly
under time schedules, arises indirectly under ratio
schedules. This is not to say that the variable is neces-
sarily exerting an effect under any schedule, but sim-
ply that it is imposed either directly or indirectly.

The fact that some schedules require what others 
permit provides a methodology for an experimental 
analysis of schedule effects. The hypothesis that an 
indirect variable (e.g., interreinforcer time) has effects 
under some schedule can be evaluated by studying
how performance is affected when it is imposed as an
explicit requirement (e.g., in time or interval sched-
ules). When imposed directly it must produce the
effects it is assumed to exert indirectly. This experi- 
mental strategy is in the tradition of Skinner (1938), 
Ferster and Skinner (1957), and Morse (1966). 


Stereotypic and Dynamic Effects


Performance under a particular schedule is gen- 
erally uniform among different subjects and in the 
same subject over prolonged periods of time. Each 
schedule accomplishes this by arranging certain inter- 
actions among characteristics of performance and the 
controlling direct and indirect variables.

These interactions can have two effects. The first is 
that certain characteristics of behavior may be re- 
peated in the same form in the future. The produc- 
tion of repetitive stereotyped behavior is the defining 
attribute of reinforcement: the response preceding the 
reinforcer increases in frequency. The second effect is 
dynamic: performance changes from one instance to 
the next. 


A PERVASIVE STEREOTYPIC EFFECT: 
RESPONSE DEPENDENCY 


When Skinner (1948) observed pigeons after giving 
them food every 15 seconds without regard to their


Fig. 1. Performance of a pigeon under FI 5-min and FT 5-min
schedules. The response pen offset at food presentations. Scale
bars: 1000 responses; 20 minutes. (From Zeiler, 1968.)


behavior, he found that each bird performed some 
consistent ritual. It seemed that this occurred because 
a particular behavior happened to occur in close 
temporal contiguity with food presentation (in chap- 
ter 5 of this volume Staddon offers a different interpre- 
tation). This temporal relation increased the proba- 
bility of the response, even though the relation was 
adventitious. Additional research indicating many 
similarities between response-dependent and response- 
independent schedules suggests that the essential na- 
ture of the response-reinforcer relation is temporal 
(Herrnstein, 1966; Zeiler, 1972a). For example, as 
shown in Figure 1, both types of schedule can main- 
tain the responses that precede them, and both have 
similar effects on how the responses are distributed in 
time (the pattern of responding). In addition, both 
bring responding under the control of the exterocep-
tive stimuli present when the reinforcer appears
(Morse & Skinner, 1957). Such data imply that response-
dependent reinforcer presentation increases the proba-
bility of the response because the dependency guaran-
tees that the effective temporal relation will occur.
In all effective response-dependent schedules (ex-
cept those involving delayed reinforcement) the speci-
fied response occurs close in time to the reinforcing
event. The result of this contiguity is that the response
is maintained at a substantial level. The precise rate
and temporal patterning of the response are deter-
mined by the particular schedule. The delivery of a
reinforcer following a single response is always an im-
portant determinant of the tendency to respond, but
the schedule modulates the rate of responding and
determines how successive occurrences of the response
are distributed in time.


THE ASYMMETRY OF REINFORCEMENT 
AND EXTINCTION 


Herrnstein (1966) and Morse (1966) noted that be- 
havior typically is acquired rapidly and lost slowly 
(although the loss is accelerated if there are numerous 
exposures to extinction). This asymmetry means that 
in all schedules a single reinforcer presentation gen- 
erates numerous subsequent repetitions of the refer- 
ence response. It is this property of reinforcement that 
is described by Skinner’s (1938) concept of the reflex 
reserve. 

Two experiments illustrate the large effects of a 
few reinforcer presentations. Skinner (1938, pp. 86- 
90) allowed rats to adapt to the experimental chamber 
and to the sound of the food magazine. He then pre- 
sented a food pellet following one press and changed 
the schedule to extinction. As shown in Figure 2, 
more than 60 presses occurred before the response rate 
returned to the preconditioning level. Also, Neuringer 
(1970) demonstrated that pigeons given food for three 
successive key pecks emitted approximately 150 pecks 
in a subsequent extinction phase. A general effect of a 
reinforcing stimulus is to generate substantial quan- 
tities of the response that precedes it (a stereotypic 
effect). 


DYNAMIC EFFECTS


Some interactions between performance and con- 
trolling variables lead to change rather than to stereo- 
typy. The important factor is the level of the variable 
in question. Consider, for example, the role of inter-
reinforcer time in fixed-ratio schedules where it is an


Fig. 2. Responding in a rat produced by a single food presen-
tation. The first response was followed by a food pellet; later
responses had no scheduled consequences. (Traced from Skinner,
1938, p. 87, Figure 15.)

Fig. 3. Performance of a pigeon under an FR 150 schedule. The
response pen reset at each food presentation.


indirect variable. If, for some reason, the interrein- 
forcer time should lengthen, response rate might de- 
crease. The consequence would be to increase inter- 
reinforcer time still further, thereby again reducing 
rate and producing an even longer interreinforcer 
time. Figure 3 illustrates such an effect, indicating 
that it is perhaps not felt immediately but may 
cumulate over several interreinforcer periods. Or an 
unusually short interreinforcer time might increase 
rate, thereby producing still shorter times and conse-
quently increasing rate still further. An intermediate
interreinforcer time, however, might not change the
prevailing rate and therefore would recur in successive
ratios. Variables operating in this way are said to have
dynamic effects. Dynamic effects do not all change be-
havior in one direction. When variables are at a high
level, they may operate to change behavior in such a
way that a low level follows. This is the way the 
number of responses emitted per reinforcer presenta- 
tion is hypothesized to operate under fixed-interval 
schedules; it will be discussed in detail in the next 
section. 
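
The circular relation between rate and interreinforcer time under a fixed ratio can be made concrete with a toy simulation. Everything quantitative in it is an assumption made for illustration: the linear update rule, the `gain` parameter, and the `neutral_time` at which rate is left unchanged are hypothetical and are not taken from any data discussed here.

```python
def simulate_fixed_ratio(ratio, start_rate, neutral_time, gain, n_ratios):
    """Return the interreinforcer time for each of n_ratios successive
    fixed ratios.

    Assumed rule: after each reinforcer, response rate changes in
    proportion to how far the last interreinforcer time fell from
    neutral_time (longer times lower the rate, shorter times raise it).
    """
    rate = start_rate  # responses per second
    times = []
    for _ in range(n_ratios):
        interreinforcer_time = ratio / rate
        times.append(interreinforcer_time)
        change = gain * (neutral_time - interreinforcer_time) / neutral_time
        rate = max(rate * (1 + change), 1e-6)  # keep the rate positive
    return times


slow = simulate_fixed_ratio(150, 1.4, 100, 0.5, 5)     # starts a little slow
fast = simulate_fixed_ratio(150, 1.6, 100, 0.5, 5)     # starts a little fast
steady = simulate_fixed_ratio(150, 1.5, 100, 0.5, 5)   # starts at the neutral value
```

Under these assumptions a slightly long interreinforcer time grows longer over successive ratios, a slightly short one grows shorter, and the intermediate value simply recurs.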

Dynamic effects can only occur when the level of a 
variable is free to change, so they are typically effects 
of indirect variables. However, if the schedule re- 
quirements were to be changed depending upon the 
characteristics of performance, it would be possible to 
observe whether a direct variable has dynamic effects. 
Adjusting schedules, which will be discussed in a 
later section, have this provision. 

Dynamic effects play an important part in deter- 
mining the frequency of responding under schedules 
of intermittent reinforcement. They are particularly
significant in schedules that maintain a high average
number of responses per reinforcer presentation, but
they also occur elsewhere. The fixed-interval schedule,
which readily shows how these dynamic effects in-
fluence response frequency, provides the focus of the
next section. The consideration of direct and indirect
variables operating in fixed-interval schedules leads 
into the analysis of the determinants of response fre- 
quency under the other major schedules. 


VARIABLES DETERMINING 
RESPONSE FREQUENCY 


Although interval schedules require only a single 
response per reinforcer presentation, they maintain
many more. At moderate and large parameter values,
a fixed-interval schedule will maintain a larger aver-
age number of responses than can be maintained by
ratio schedules. For example, Ferster and Skinner
(1957, pp. 518-620) correlated one stimulus with an
FI 5-min schedule and another with an FR 275 sched-
ule (multiple FI 5-min FR 275). Responding was
severely strained under the fixed-ratio schedule with
periods of 80 minutes and more occurring without a
response. However, if the fixed-interval stimulus was
introduced during the pauses, more than 275 re- 
sponses often were emitted within the 5-min period. 
In general, it is difficult to maintain responding with 
fixed-ratio schedules higher than FR 300 even after 
prolonged exposure to lower values. Yet an average of 
300 responses per reinforcer presentation is main- 
tained routinely with fixed-interval schedules. Figure 
4 shows cumulative records for the same pigeon under


Fig. 4. Cumulative records for the last session of training of a
pigeon with FR 600 and the first session with the FI 40-min
schedule. The response pen reset at 1100 responses and with
food presentation. Offsets of the event pen on the FI record
indicate when food became available for the next response.
Breaks in the FR record indicate periods with no responses.
Figure labels: Pigeon 136; 100 responses; 40 minutes.

the last session of FR 600 and the immediately suc-
ceeding session involving an FI 40-min schedule. The
bird was studied under FR 1000 as well. The FR 1000 
schedule (not shown) did not sustain responding, i.e., 
the bird responded infrequently during sessions as 
long as 16 hours and never completed a ratio. Re- 
sponding was maintained with FR 600, but there were 
very long pauses and many hours between successive 
food presentations. With the FI 40-min schedule, how- 
ever, there was an average of well over 1000 responses 
in each 40-min period. Ferster and Skinner (1957) 
show numerous records with several thousand re- 
sponses occurring in an interval with no sign of 
strained behavior. 

Why can a fixed-interval schedule maintain so 
many responses? An answer to this question helps to 
reveal the variables responsible for response rate 
under both interval and ratio schedules. 


Response Number in Fixed-Interval Schedules: The
Herrnstein and Morse (1958) Experiment


An important factor in the ability of fixed-interval 
schedules to maintain a high average number of re- 
sponses per reinforcer is simply that they require only 
one. Herrnstein and Morse (1958) drew attention to 
this apparent paradox in their investigation of a con- 
junctive fixed-interval, fixed-ratio schedule. Since, in 
a conjunctive schedule, the reinforcer is delivered 
when both individual schedule requirements have 
been met, the direct effects of the schedule involved 
both the minimum number of responses specified by 
the fixed-ratio component and the minimum interrein- 
forcer time followed by a single response specified by 
the fixed-interval component. In a conjunctive FI 15- 
min FR 40 schedule, for example, a reinforcer is 
presented following the first response after 15 minutes 
if at least 39 responses have occurred earlier. Other- 
wise, the reinforcer is presented as soon after 15 min- 
utes as the 40th response is emitted. The conjunctive 
FI FR schedule imposes minimum response require- 
ments on the fixed-interval schedule, the minimum 
value depending on the parameter of the fixed-ratio 
component. (It also imposes a minimum interrein- 
forcer interval in a fixed-ratio schedule.) Herrnstein 
and Morse maintained the interval value at 15 min- 
utes and varied the ratio value from zero (a simple 
fixed-interval schedule) up to 240. 
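
The conjunctive rule described above can be sketched as a short function. The name is hypothetical, times are in seconds from the previous reinforcer, and the sketch covers a single interreinforcer period only.

```python
def conjunctive_fi_fr_reinforced_time(response_times, interval, ratio):
    """Return the time of the response that produces the reinforcer
    under a conjunctive FI FR schedule, or None if neither the interval
    nor the ratio requirement is completed.

    The reinforced response is the first one that occurs after the
    interval has elapsed and that is at least the ratio-th response.
    """
    for count, t in enumerate(sorted(response_times), start=1):
        if t >= interval and count >= ratio:
            return t
    return None


# Conjunctive FI 15-min FR 40, times in seconds.
# Case 1: 39 responses occur early, so the first response after
# 15 minutes is reinforced.
early = list(range(1, 40))
case_1 = conjunctive_fi_fr_reinforced_time(early + [905], 900, 40)

# Case 2: only 10 responses occur early, so food waits for the 40th.
late = [900 + i for i in range(1, 41)]
case_2 = conjunctive_fi_fr_reinforced_time(list(range(1, 11)) + late, 900, 40)
```

Setting the ratio to 1 or less reduces the rule to a simple fixed-interval schedule.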

The left panel of Figure 5 shows the average num- 
ber of responses per 15-min interval under each ratio 
requirement. As the ratio was increased, the number 
of responses per interval decreased. Both birds emitted 
close to 300 responses per interval with the simple

Fig. 5. The left panel shows the number of responses per 15
min; the right panel shows the mean interreinforcer time. (Data
from Herrnstein & Morse, 1958. © 1958 by the Society for the
Experimental Analysis of Behavior Inc.)


fixed-interval schedule, but averaged 100 or less when 
food presentation required 240 responses. When 240 
responses were required, one bird took more than 4 
hours to obtain food (right panel), yet without the 
ratio requirement an average of more than 240 re- 
sponses was emitted in 15 minutes (left panel). 

These data indicate that an important factor in 
fixed-interval performance is that the schedule does 
not require more than one response. It is important, 
though, that this response be in close temporal con- 
tiguity to the reinforcer. A conjunctive FT FR 1 
schedule also requires a single response, but it does 
not guarantee that it immediately precede the rein- 
forcer. The data shown in Figure 6 corroborate 
Morgan’s (1970) and Shull’s (1970) reports that such a 
schedule maintains responding, but at a substantially 
lower level than a comparable fixed-interval schedule. 
If the response requirement is entirely eliminated by 
changing a fixed-interval to a fixed-time schedule, re- 
sponding will eventually either fall to a low level or 
cease (Herrnstein, 1966; Zeiler, 1968). The responding 
maintained by a fixed-interval schedule evidently in- 
volves something other than the simple requirement 
of a single response per reinforcer presentation and/or 
the temporal regularity of the reinforcing stimulus. 

Herrnstein and Morse attributed the high average 
frequency of responding on the fixed-interval schedule 
to the dynamic effect of the indirect variable number 
of responses per reinforcer. Consider the following 
hypothesis: The number of responses in an interval 
is determined by the number of responses in pre- 
ceding intervals. High-response intervals (many re- 
sponses per reinforcer) generate few responses in 
subsequent intervals; low-response intervals (few re-
sponses per reinforcer) generate many responses

Fig. 6. Performance of a pigeon under a conjunctive FR 1 FT
2-min schedule compared with an FI 2-min schedule. The re-
sponse pen reset at each food presentation.


subsequently. This is a dynamic effect, because the 
value obtaining at one time can produce a different 
value later which will then itself determine the next 
value and so forth. It is an effect of an indirect vari-
able, because the schedule does not specify how many
responses (beyond one) must occur. 

According to this hypothesis, a high frequency of
responding in an interval is caused by preceding low-
response intervals. The simple fixed-interval schedule
allows as few as one response, but the addition of a
fixed-ratio requirement means that there must be at
least the number of responses specified by the ratio.
Therefore, imposing a fixed-ratio requirement reduces
responding by preventing a dynamic effect responsible
for high numbers of responses.
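
The logic of this argument can be shown with a deliberately artificial model. The inverse rule used here (the next interval's response number equals a constant divided by the current number) is an assumption chosen only because it makes few responses generate many and many generate few; it is not a claim about the actual function, and the constant 300 is arbitrary.

```python
def mean_responses(product, floor, n_intervals, start):
    """Mean responses per interval under a toy inverse relation.

    Assumed rule: next number = product / current number, but never
    fewer than `floor`, which stands in for the minimum imposed by an
    added fixed-ratio requirement (floor = 1 is a simple FI).
    """
    n = max(start, floor)
    total = 0.0
    for _ in range(n_intervals):
        total += n
        n = max(floor, product / n)
    return total / n_intervals


simple_fi = mean_responses(product=300, floor=1, n_intervals=100, start=1)
with_ratio = mean_responses(product=300, floor=40, n_intervals=100, start=1)
```

In this toy case the simple schedule alternates between very low and very high intervals and averages 150.5 responses, whereas the floor of 40 removes the low intervals and the average settles at 40.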

The role of variation in the number of responses
per reinforcer is evident from comparisons of variable-
ratio and fixed-ratio schedules. Ferster and Skinner
(1957, pp. 407-410) established responding under a VR
360 schedule, and they then changed the schedule to
FR 360. There were more responses under the vari-
able ratio; in fact, the fixed ratio did not always main-
tain responding. At the same average number of re-
sponses per reinforcer, therefore, variable numbers can
maintain more responses than fixed numbers.

The maximum number of responses per reinforcer 
can be restricted without affecting the ability of a 
schedule to sustain responding. Neuringer and 
Schneider (1968) used an FI 30-sec schedule in which 
each response prior to the last produced a blackout 
(an intertrial interval). By varying the duration of the

Fig. 7. Latency in seconds of the pigeons’ first response after
food presentation (filled points) and the time between successive
subsequent responses (open points). (Left) Maximum number
of responses per reinforcer on an FI 30-sec schedule. (Right)
Minimum interreinforcer time on an FR 15 schedule. (Data
from Neuringer & Schneider, 1968.)


blackout and measuring the interval in real time (by 
adding the blackout durations to the time spent re- 
sponding), they restricted the total number of re- 
sponses that could occur. For example, if the blackout 
duration was 4.96 sec, no more than six responses 
could occur. As in the ordinary fixed-interval schedule, 
the reinforcer could follow a single response. The left
panel of Figure 7 shows that blackout durations rang-
ing from .34 sec (maximum of 88 responses per rein-
forcer) to 4.96 sec (maximum of six responses) did not
change the latency of each response. To the extent
that this discrete-trial procedure involving intertrial
intervals of different durations is related to the typical
fixed-interval schedule, it shows that behavior is un-
affected by restricting the maximum number of re-
sponses. The important factor, as shown by Herrn-
stein and Morse (1958), is that the possibility of few
responses be preserved. 
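
The two maxima quoted above are consistent with a simple division of the 30-sec interval by the blackout duration. The sketch below assumes that this is how the limit arises; the original authors' exact accounting (for instance, of response latencies) may have differed, and the function name is hypothetical.

```python
import math


def max_responses(interval_sec, blackout_sec):
    """Largest whole number of blackouts, and hence of responses that
    produce them, that fits within the interval."""
    return math.floor(interval_sec / blackout_sec)


longest_blackout = max_responses(30, 4.96)   # six responses
shortest_blackout = max_responses(30, 0.34)  # 88 responses
```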


Cyclicities in Responding 


Skinner (1938, pp. 123-126) found that responding 
under fixed-interval schedules varied in four ways. 
There were oscillations in the number of responses 
per session (first-order deviations); response frequency 
changed from one interval to the next in each ses-
sion (second-order deviations); response rate changed
within individual intervals (third-order deviations); 
individual responses tended to occur in groups (fourth- 
order deviations). First-order deviations have not re- 
ceived attention subsequently, while fourth-order 
deviations may occur with all schedules (Blough, 1963; 
Skinner, 1938). The distinctive characteristics of fixed- 
interval performance are the second- and third-order 
deviations. The third-order deviation—the pattern of 
responding or the distribution of responses in the 
time between successive reinforcer presentations—is a 
most important characteristic of different schedules 
and will be treated separately. The second-order devia- 
tions—the varying number of responses per interval— 
are of main concern now. Other investigators have 
also found that this deviation remains after extended 
exposure to fixed-interval schedules (Cumming & 
Schoenfeld, 1958; Dews, 1970). A satisfactory explana- 
tion of fixed-interval performance must explain the 
variability in response number per interval. This 
variability is illustrated in Figure 8. 

Dews (1970) has shown that under an FI 3-min 
schedule the number of responses in an interval can 
vary over nearly a 50-fold range. In analyzing the rela- 
tions among the number of responses emitted in 200 
consecutive 3-min fixed intervals, Dews found two 
interesting phenomena. The first was shown by classi- 
fying intervals in terms of which of six class intervals 
described the number of responses. There was a gen- 
eral tendency for intervals with many responses to 
follow intervals with many responses. After an un-
predictable number of high-response intervals, one or
more low-response intervals occur and the cycle re-
peats. There may also be a series of intervals having
about the mean number of responses per interval. As
Dews says, there is “irregular periodicity that was seen
as a waxing and waning of the prevailing numbers of
responses in sequence of intervals” (p. 59). It seems
evident, therefore, that the relations controlling the
number of responses in successive intervals may not
operate immediately from one interval to the next but
instead are cumulative effects over at least several
intervals.

Fig. 8. Performance of a pigeon under an FI 15-min schedule.
The response pen reset at each food presentation. Offsets of the
event pen indicate when the 15-min interval timed out. The
numerals adjacent to each interval indicate the number of
responses in the interval to the nearest five responses as mea-
sured from the record.

A second kind of cyclicity was revealed by ignoring 
the absolute number of responses and considering 
only the direction of change from one interval to the 
next. Dews found an alternation pattern in which 
intervals tended to be preceded and followed by inter- 
vals having more responses. “A second-order effect, 
alternation, was occurring during the session to a 
slight degree but . . . quantitatively this effect was 
small (and as a matter of fact, inconsistent from sub- 
ject to subject)” (p. 58). Shull’s (1971) data on sequen- 
tial relations among initial pause durations can be 
interpreted as also showing a small alternation ten- 
dency on the assumption that pause duration and re- 
sponse number covary. Randolph and Sewell (1968) 
found that low-response intervals do tend to be fol- 
lowed by intervals with many responses, but that there 
is not an equally strong tendency for many responses 
to be followed by few. In general, then, there is a long-
term effect of number of responses per reinforcer felt
over several intervals and a more immediate but 
smaller effect seen from one interval to the next. 
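
An analysis of the direction of change alone can be sketched as follows. This is one plausible way to quantify alternation, not a reconstruction of Dews's procedure; the function name is hypothetical, and the decision to drop intervals that do not differ from their predecessor is an assumption.

```python
def alternation_fraction(counts):
    """Fraction of successive changes in response number that reverse
    direction (an increase followed by a decrease, or the reverse).

    Returns None when there are too few changes to compare. A value
    near 1.0 indicates strict alternation; a value near 0.0 indicates
    steady runs in one direction.
    """
    signs = []
    for previous, current in zip(counts, counts[1:]):
        if current != previous:  # assumption: ties carry no direction
            signs.append(1 if current > previous else -1)
    pairs = list(zip(signs, signs[1:]))
    if not pairs:
        return None
    return sum(a != b for a, b in pairs) / len(pairs)
```

The absolute numbers do not matter to this measure: the sequences 10, 20, 10, 20 and 10, 200, 10, 200 give the same result.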

Herrnstein and Morse’s (1958) explanation of fixed- 
interval responding in terms of the dynamic effects of 
the number of responses per reinforcer suggests that 
there should be oscillation in response number in suc- 
cessive intervals. In fact, the hypothesis that response 
number has dynamic effects seems to originate in Skin- 
ner’s (1938) and Ferster and Skinner’s (1957) attempts 
to explain the cyclicities. According to these views, 
intervals containing few responses generate intervals 
containing many responses, and these in turn generate 
few responses which then produce many responses and 
the cycle repeats. Apparently, the effects of a high re- 
sponse interval are not immediately to generate few 
responses in the next interval. Instead, effects seem to 
reveal an accumulation over several successive 
intervals. 

In fixed-interval schedules, number of responses per 
reinforcer operates indirectly and its role must be in-
ferred. However, fixed-ratio schedules establish num-
ber as an explicit independent variable and thereby
disclose its effects on behavior. What happens when
number of responses per reinforcer is manipulated 
directly? 


Number of Responses and Fixed-Ratio Performance 


Felton and Lyon (1966) and Powell (1968) extended 
Ferster and Skinner’s (1957) investigations of the 
effects on pigeons of varying the fixed-ratio value. 
Both found that the duration of the initial pause in- 
creased as the ratio increased. In some subjects the 
postpause rate decreased with increases in the ratio, 
and in others the changes were less clear. Thus over-
all response rate (total responses divided by pause
time plus the time spent responding) depends on, but 
is not linearly related to, the number of responses per 
reinforcer. 
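
The definition of overall response rate given in parentheses can be worked through directly. The numerical values in the example are hypothetical and serve only to show why the relation need not be linear when the initial pause grows with the ratio.

```python
def overall_rate(ratio, pause_sec, running_rate):
    """Overall response rate for one fixed ratio: total responses
    divided by pause time plus the time spent responding.

    running_rate is the postpause rate in responses per second.
    """
    time_responding = ratio / running_rate
    return ratio / (pause_sec + time_responding)


# Hypothetical values: doubling the ratio while the pause quadruples.
small_ratio = overall_rate(ratio=100, pause_sec=50, running_rate=2)   # 1.0
large_ratio = overall_rate(ratio=200, pause_sec=200, running_rate=2)
```

Even with the postpause rate held constant in this example, the longer pause at the larger ratio lowers the overall rate.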

This relation corresponds to that hypothesized to 
account for response number fluctuations with fixed- 
interval schedules and to the results reported by 
Herrnstein and Morse (1958) with conjunctive FI FR 
schedules. Here, too, a small number of responses per 
reinforcer generated subsequent high average rates 
and many responses generated lower average rates.
Since the hypothesized indirect effects were consistent
with those found when the variable was imposed
directly, the conclusion that the number of responses
per reinforcer operates as an indirect determinant
gains plausibility.

The fixed-ratio schedule does not allow the number
of responses to have dynamic effects, because the num-
ber is held constant. Instead, it reveals how behavior
is related to response number. With sufficiently high
constant numbers (high ratios), a point is reached at
which overall response rate drops and responding is
poorly maintained in the absence of special histories.
The precise nature of these histories is not well under-
stood.


Interreinforcer Time 


In general, at values under which fixed-ratio sched- 
ules maintain responding readily, the responses are 
emitted at higher rates than occur with fixed-inter-
val schedules generating about the same average num-
ber of responses per reinforcer. This conclusion is de-
rived from Ferster and Skinner’s (1957) separate
experiments involving fixed-ratio and fixed-interval 
schedules. If this is in fact the case, the number of 
responses per reinforcer cannot be the sole determi- 
nant of response rate, because response number can 
be the same under both types of schedule while rate 
varies.

The reason that equal responses per reinforcer can
generate higher rates on ratio than on interval sched-
ules lies in the relation of responding to the time be-
tween reinforcer presentations (interreinforcer time).
The interreinforcer time is a function of response rate
with ratio schedules, but its minimum is specified by
the parameter value of interval schedules. When in-
terreinforcer time is controlled on ratio schedules, re-
sponse rate is affected markedly. Consider again
Herrnstein and Morse’s (1958) conjunctive FI FR
schedules, but this time from the point of view of the
ratio component. Interreinforcer time is not free to
decrease to less than 15 min because of the FI 15-min
requirement. Consequently, the minimum value is
controlled. Also consider that Felton and Lyon (1966)
found that under a simple FR 150 schedule birds aver-
aged somewhat more than 200 seconds to emit 150 re-
sponses. Herrnstein and Morse, in contrast, found
that with a conjunctive FI 15-min FR 120 schedule
the shortest average interreinforcer time was more
than 30 min (Figure 5, right panel). The importance
of leaving the minimum interreinforcer time uncon-
trolled to obtain high rates under ratio schedules
seems evident. This was also shown by Neuringer and
Schneider (1968). They used an FR 15 schedule in
which each response before the last produced a black-
out period, while the fixed-ratio schedule held the
number of responses constant. The right panel of
Figure 7 shows that the longer the interreinforcer
time (the longer the blackouts), the longer was the
latency of each response. These data are further evi-
dence that the relation between response and rein-
forcer rates is the reason for the high rates under ratio
schedules.

Variable-ratio schedules of moderate value main- 
tain a high and fairly constant rate of responding 
(Ferster & Skinner, 1957, Chapter 7). As in the case of
fixed-ratio schedules, the high rates depend on the
circular relation between response rate and interrein-
forcement time. This was shown by Ferster and Skin-
ner (pp. 399-407) in the following way.

Pigeons were matched in their stable response rates 
under a variable-interval (VI 5-min) schedule. Then 
one bird was changed to a variable-ratio schedule in 
which the number of responses per reinforcer was 
chosen to match the number emitted on the variable-
interval schedule. The second bird was yoked to the
first. When the first received food, the second obtained
food for its next response. No other responses were
required for the second bird. This generated a vari-
able-interval schedule for the second bird, with inter-
reinforcer time under the VI schedule matching that
achieved for the first bird under VR. For one pair of
birds the variable-ratio schedule generated two- to 
threefold higher rates than did the yoked variable- 
interval schedule. Interreinforcer time could not be 
responsible for the difference since it was the same; 
instead, the important factor had to be the depen- 
dence of interreinforcer time on response rate for the 
ratio bird. 
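
The yoking rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: times are in seconds, the function name is hypothetical, and the handling of a food setup that arrives while an earlier one is still uncollected (it is dropped) is an assumption, since the procedure as summarized here does not say what happened in that case.

```python
def yoked_reinforcer_times(lead_food_times, follower_response_times):
    """Times at which the yoked (second) bird is fed.

    Each food delivery to the lead (VR) bird sets up food for the
    follower; the follower's next response collects it.
    Assumption: only one setup can be pending at a time.
    """
    responses = sorted(follower_response_times)
    fed = []
    i = 0
    for setup in sorted(lead_food_times):
        if fed and setup < fed[-1]:
            # An earlier setup was still pending; drop this one (assumption).
            continue
        # Skip follower responses made before food was set up.
        while i < len(responses) and responses[i] < setup:
            i += 1
        if i == len(responses):
            break  # the follower never responded again
        fed.append(responses[i])
        i += 1
    return fed


# Hypothetical event times, for illustration only.
print(yoked_reinforcer_times([10, 25, 40], [3, 12, 13, 30, 31, 50]))
```

The follower's interreinforcer times track those of the lead bird, yet they do not depend on how fast the follower itself responds, which is the contrast the experiment exploits.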

A second effect occurred for another pair of birds.
Responding could not be maintained under the vari-
able-ratio schedule after the change from the initial
variable interval. That one pair of birds revealed a
higher rate under VR than VI while the other stopped
responding under VR is not contradictory. When re-
sponding is well maintained by a ratio schedule, the
rate is higher than with an interval schedule gen-
erating comparable numbers of responses per rein-
forcer; however, interval schedules will maintain a
number of responses per reinforcer that cannot be sus-
tained by ratio schedules.

Responding in interval and ratio schedules, in sum-
mary, reflects the operation of the number of re-
sponses per reinforcer and the interreinforcer time.
The response number characteristics of interval sched-
ules arise from the dynamic role of varying numbers
of responses per reinforcer combined with constant
interreinforcer times; the rate characteristics of ratio
schedules arise from constant numbers of responses
per reinforcer combined with behavior-dependent in-
terreinforcer times. The parameters of each of these
schedules are important in determining the perfor-
mance engendered, because they establish the levels of
the variables either directly or indirectly.


Tandem Schedules with FI and FR Components


The present analysis cannot explain why tandem
FI FR schedules should have the effects that they
sometimes do. In such a schedule, completion of a
fixed-interval requirement initiates the ratio require-
ment, and completion of the ratio then produces the
reinforcer. There are no stimulus changes correlated
with the different schedule components. Ferster and
Skinner (1957, pp. 416-422) found that a tandem FI
45-min FR 10 schedule increased response rate above
that occurring with an FI 45-min schedule. Note that
the tandem schedule places restrictions on both the
minimum interreinforcer time and the minimum
number of responses per reinforcer. The rate increase,
therefore, is unexpected. Ferster and Skinner (p. 429)
also showed that responding was maintained when the
ratio component was increased to FR 400. Parametric
analyses are necessary to show whether rate increases
are general effects or occur only when the interval
component is so large that it maintains a high average
number of responses per reinforcer by itself.


The Relation Between Responses Per Reinforcer
and Interreinforcer Time


Experiments studying the effect of fixed-ratio size 
on response rate consistently have confounded the 
direct effects of responses per reinforcer with the in- 
direct effects of interreinforcer time. As ratio size was 
increased, the birds took more time to obtain each 
reinforcer. Therefore, increased interreinforcer time
rather than, or in addition to, increased response num- 
ber may have produced the lower overall rate. Changes 
in both variables occurred in Herrnstein and Morse’s 
(1958) experiment as well. Under the conjunctive FI 
FR schedule any time the ratio requirement had not 
been met by the time the 15-min interval had elapsed, 
interreinforcer time increased. The right panel of 
Figure 5 shows that interreinforcer time did increase 
as a function of the size of the ratio, and the left panel 
shows that response rate decreased accordingly. The 
effects observed by Herrnstein and Morse may have 
been due to responses per reinforcer, to interreinforcer 
time, or to some combination of the two. 

Neuringer and Schneider’s (1968) attempt to sepa- 
rate the two variables was only partially successful. 
When they maintained an FR 15 schedule while vary- 
ing interreinforcer time, they found that, as the time 
increased, response latency (time to the first response) 
increased as well (Figure 7, right panel). This demon-
strated the role of interreinforcer time independent of
the numbers of responses. Further, when they con- 
trolled interreinforcer time by establishing an FI 30- 
sec schedule while limiting the number of responses 
that could be emitted, latency was the same inde- 
pendent of response number (Figure 7, left panel). 
Thus they demonstrated that restricting the maximum 
number of responses had no significant effect on re-
sponding. They did not, however, control how few
responses could occur, i.e., the minimum number of 
responses per reinforcer. Since the minimum number 
was unrestricted under all conditions, there was no 
interference with the property now hypothesized to 
induce high rates, and consequently no effect was 
observed. 

An experiment by Crossman, Heaps, Nunes, and 
Alferink (1974) showed that minimum number of re-
sponses exerted effects with interreinforcer time con-
trolled. They arranged a multiple schedule in which
the first component was a ratio of FR 25, FR 50, or
FR 100 and the second involved an FR 2 schedule 
with the 2 responses separated by a blackout. A com- 
puter recorded the time spent in the first component 
(interreinforcer interval). It then also recorded the 
latency of the first response of the second component 
and adjusted the duration of the blackout so that it 
ended when the latency plus the blackout equaled the 
interreinforcer interval of the first component. The 
next response produced food. The result was that the 
interreinforcer time was the same in both components, 
but one required 25, 50, or 100 responses whereas the 
other required 2. 
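
The matching rule amounts to one subtraction: the blackout fills whatever part of the first component's interreinforcer interval the latency has not already used. A minimal Python sketch follows, assuming times in seconds; the function name is hypothetical, and clamping at zero when the latency already exceeds the target interval is an assumption, since that case is not described here.

```python
def blackout_duration(first_component_time, latency):
    """Blackout length so that latency + blackout equals the
    interreinforcer interval recorded in the first component.

    Assumption: if the latency alone already exceeds that interval,
    no blackout is given (duration 0).
    """
    return max(0.0, first_component_time - latency)


# Hypothetical values: 40 sec spent in the FR component, 6-sec latency.
print(blackout_duration(40.0, 6.0))  # 34.0
```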

Interreinforcer time in the first component in- 
creased as the ratio was raised. Figure 9 shows that the 
latency to the first response of both components in- 
creased as well. Since the response requirement was 
constant at two in the second component, interrein- 
forcer time clearly was relevant. In addition, however, 
the latency was always shorter under the 2-response 
requirement than with the 25-, 50-, or 100-response 
requirement even though interreinforcer times were 
matched in both. In other words, response number 
was important independent of interreinforcer time. 
Although the response-produced blackouts in this and
Neuringer and Schneider’s experiment probably
exerted distinctive effects that themselves still need to
be isolated, the experiments do suggest that both in-
terreinforcer time and minimum response require-
ment are important independent sources of control.

Fig. 9. Latency of the first response on FR 25, FR 50, and FR
100 schedules and on FR 2 schedules equated for interreinforcer
time. (Data from Crossman et al., 1974.)


The Regenerating Power of Interval Schedules 


The interval schedule tends to have the number of
responses per reinforcer regress toward the mean.
When responding is strong and many responses occur,
the number of responses per reinforcer becomes high,
and responses subsequently occur less frequently.
When responding is weak, this means few responses
per reinforcer and a subsequent high response rate.
The occurrence of a reinforcer when the tendency to
respond is low may have another related effect: should
the organism stop responding for a time exceeding the
interval parameter value, the next response will be
followed by the reinforcer and responses will be re-
generated.

Ratio schedules do not share this regenerating char-
acteristic: no matter how much time elapses without a
response, a reinforcer will not occur following the
next instance, unless it just happens that the next
one ends the ratio. Because ratio schedules do not
provide a reinforcer just when it is needed to revive
weak behavior, they are usually involved when a re-
sponse is poorly maintained (Morse, 1966, p. 86).

If this regenerating potential is important, a vari-
able-ratio schedule that matched a high-valued fixed-
interval schedule in the sequence of response numbers
per reinforcer would not maintain responding as well
as the fixed interval. The variable-ratio schedule does
not automatically adjust the response requirement
downward when performance weakens. Since a cor-
respondence between weak behavior and a low ratio
would be entirely fortuitous, the variable-ratio sched-
ule lacks a perfectly tuned regenerating characteristic.
The only relevant information appears in Ferster and
Skinner’s (1957) comment about variable-ratio sched-
ules: “As in all schedules requiring a number of re-
sponses, the bird will stop responding altogether if
the average number goes beyond a certain value” (p.
391). In contrast, there is no apparent limit to how
high an average number of responses can be achieved
with interval schedules. If higher rates should occur
on the variable-ratio schedule, however, it would sug-
gest that the regenerating potential is not important.

The IRT>t schedule has the same built-in regen-
erating property as interval schedules. Under IRT>t
schedules, whenever responding becomes weak enough
that the organism pauses beyond the parameter value,
the next response is followed by the reinforcer. As
would be expected, therefore, these schedules also
maintain many responses (e.g., Herrnstein & Morse,
1957).


ADJUSTING FIXED-RATIO SCHEDULES 


Research involving adjusting fixed-ratio schedules
further supports the hypothesis that schedules will
maintain a large average number of responses if they
provide a reinforcer whenever responding weakens.
In an adjusting fixed-ratio schedule, the ratio require-
ment is changed based on some behavioral criterion.
For example, Ferster and Skinner (1957, pp. 718-720)
imposed an FR 160 schedule. Whenever the bird
paused for 2 min or more before making the first re-
sponse, the schedule became FR 1. This meant that
whenever the FR 160 schedule failed to support
responding, a single response produced the reinforcer.
The adjusting schedule maintained a high response
rate without long pauses; only occasionally did the
FR 1 condition operate. Comparisons with other data
suggest that the average rate was higher than might
be expected with an FR 160 schedule alone.

In a second experiment (pp. 720-721) Ferster and 
Skinner employed a continuously adjusting schedule. 
(Actually, it was an interlocking schedule, since, in a
given component, one parameter decreased as a func-
tion of the other.) Whenever the initial pause was less
than 25 sec, the ratio increased by 5 responses. During 
the first 25 sec the ratio decreased slowly unless there 
was a response (the rate of decrease is not specified). 
Responding was maintained in three birds with ratios 
of 445, 600, and 650 responses respectively. Such ratio 
values do not sustain behavior easily; for them to do 
so, it appears necessary to adjust requirements to the 
ongoing behavior. 

The most systematic study of adjusting schedules
was that of Kelleher, Fry, and Cook (1964, Experi-
ment 1) with squirrel monkeys. It differed from the
Ferster and Skinner experiments in that pausing
affected the ratio requirement in a subsequent rather
than in the current component (by Ferster and Skin-
ner’s definition, it was an adjusting as opposed to an
interlocking schedule). A session began with an FR 10
schedule. If two successive postreinforcement pauses
were shorter than time t, the ratio changed to the next
higher value. Sixteen ratio values increased in steps
from FR 10 to FR 1000. If two successive pauses ex-
ceeded t, the ratio decreased by one step (except that
one pause exceeding t would reduce the ratio from
FR 1000 to FR 870). The ratio remained constant if
two successive pause durations were alternately longer
and shorter than t. The values of t were 1, 2, 4, 8, and
15 min. Kelleher et al. pointed out that under simple
fixed-ratio schedules, performance usually shows long
pauses with schedules higher than FR 100. With the
adjusting schedule, the average maintained fixed-ratio
value increased with increases in t, but all levels of t
produced averages higher than 100 responses per rein-
forcer. Average ratio values over 200 were typical with
t values from 2 to 8 min, and the average number of
responses was over 400 when t was 15 min. So a fixed-
ratio schedule that lessens the response requirement
when responding weakens maintains a substantial
level of responding.
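
The adjustment rule can be written as a small step function over a ladder of ratio values. The Python sketch below is an illustration only: the function name and the four-step ladder are hypothetical (the sixteen actual values are not listed here), and the rule is applied to the two most recent pauses, which is an assumption about how successive pairs were evaluated.

```python
def adjust_ratio_index(index, previous_pause, current_pause, t, top_index):
    """Return the ladder position for the next component.

    index is the current position on the ladder of ratio values and
    t is the pause criterion, in the same time units as the pauses.
    """
    # Exception at the top step: a single pause longer than t steps down.
    if index == top_index and current_pause > t:
        return index - 1
    # Two successive short pauses: move to the next higher ratio.
    if previous_pause < t and current_pause < t:
        return min(index + 1, top_index)
    # Two successive long pauses: move down one step.
    if previous_pause > t and current_pause > t:
        return max(index - 1, 0)
    # One longer and one shorter pause: ratio unchanged.
    return index


# Hypothetical ladder, for illustration only (not the values used in the study).
LADDER = (10, 20, 40, 80)
step = adjust_ratio_index(0, previous_pause=30, current_pause=40, t=60, top_index=3)
print(LADDER[step])  # 20
```

Because the requirement drops whenever pausing lengthens, the ratio in force is tied to the current strength of responding.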


INTERLOCKING FIXED-RATIO, 
FIXED-INTERVAL SCHEDULES 


A schedule that would seem to adjust ratio require-
ments to the momentary strength of responding is the
interlocking fixed-ratio, fixed-interval schedule. In
such a schedule, the fixed-ratio value decreases as time
elapses since the preceding reinforcer and reaches
FR 1 when the interval has elapsed. The rate of de-
crease in the ratio value is a function of the interval
parameter: longer intervals mean slower decreases. If
responses are emitted at a high rate, the reinforcer
occurs after a number of responses close to the initial
ratio value; if responding is slow, the reinforcer occurs
after fewer responses.
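
One way to picture the interlocking contingency is as a response requirement that shrinks with time since the last reinforcer. The sketch below assumes a linear decrease from the initial ratio to FR 1; the text states only that the ratio falls and reaches FR 1 when the interval elapses, so the linear form, the function name, and the example values are assumptions.

```python
from math import ceil


def required_responses(elapsed, initial_ratio, interval):
    """Responses required for reinforcement `elapsed` sec after the
    preceding reinforcer, assuming a linear fall from `initial_ratio`
    at time 0 to 1 at `interval` sec (and 1 thereafter)."""
    remaining = initial_ratio - (initial_ratio - 1) * elapsed / interval
    return max(1, ceil(remaining))


# Hypothetical schedule: initial ratio 32, interval 120 sec.
for seconds in (0, 60, 120, 200):
    print(seconds, required_responses(seconds, 32, 120))
```

With a longer interval the same initial ratio declines more slowly, so a slow responder must emit more responses before the requirement reaches FR 1.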

Parametric data on interlocking FR FI schedules 
were reported by Berryman and Nevin (1962). Unfor-
tunately, from the point of view of the present argu-
ment, the ratio value used (FR 32) itself maintained
responding quite well. Combining the FR 32 schedule
with a fixed-interval schedule in an interlocking FR 
FI reduced the overall rate of responding, with the 
amount of decrease positively related to the interval 
value. Apparently, the possibility of a reinforcer 
occurring when behavior weakens is unnecessary with 
small ratio schedules; under these conditions the 
occasional delivery of a reinforcer via an interval 
schedule seems to suppress responding. An interlock- 
ing FI FR schedule should markedly increase respond- 
ing if the ratio schedule is so large that it does not 
maintain behavior very well by itself. This prediction 
has not yet been tested. 


RESPONSE PATTERNING: THE 
TEMPORAL ORGANIZATION 
OF BEHAVIOR 


Different schedules have distinctive effects on the
way responses are distributed in the time between suc-
cessive reinforcer presentations. These patterns of re-
sponding are stable and characteristic of the schedule.

The fixed-interval pattern is Skinner’s (1938) third- 
order deviation from responding at a steady rate. Skin- 
ner describes it as follows: “Deviations of a third 
order appear as depressions in the rate of elicitation 
after the periodic reconditioning of the reflex. They 
are followed typically by compensatory increases, so 
that the total rate is unchanged” (p. 125). Much sub- 
sequent research has confirmed that there typically is a 
pause followed by responding. Postpause responding 
may display either continuous positive acceleration, 
positive acceleration changing to a steady high rate, 
or an abrupt transition from no responding to a high 
rate (e.g., Cumming & Schoenfeld, 1958; Ferster & 
Skinner, 1957; Schneider, 1969). Responding under 
variable-interval and random-interval schedules is 
fairly constant between successive reinforcers. Catania 
and Reynolds (1968) found that the precise nature of 
the variable-interval pattern depended on how the 
constituent intervals were selected and how they were 
distributed. Response rate can be either positively
accelerated, negatively accelerated, or approximately
linear depending on the distribution of interrein-
forcer intervals. Ratio schedules also produce charac-
teristic patterns. With fixed-ratio schedules, there is a
pause followed by a transition to responding at a high
rate; the pause duration is a function of the fixed-ratio
size (e.g., Felton & Lyon, 1966; Powell, 1968). Fre-
quently, the transition is abrupt, but sometimes there
is positive acceleration, and at times responding may
slow somewhat at the end of the ratio (Ferster & Skin-
ner, 1957). Under moderately valued schedules (e.g.,
FR 30) the rate is often very stable after the initial
transition (Gott & Weiss, 1972; Weiss & Gott, 1972).
Variable-ratio and random-ratio schedules tend to 
produce responding at a high steady rate, at least as 
judged from cumulative records (Ferster & Skinner, 
1957; Schoenfeld et al., 1972). What variables are re- 
sponsible for the patterns engendered by each 
schedule? 


Temporal Placement of the Reinforcing Stimulus 


A differentiating characteristic of interval schedules 
is the regularity of reinforcer presentation in relation 
to time. In fixed-interval schedules the reinforcer 
occurs at regular times (given that response rate is 
sufficiently high—as it invariably is—to produce the 
reinforcer close to when it becomes available), and in 
variable-interval and random-interval schedules it
occurs irregularly. The placement of the stimulus in
time plays a major role in determining the pattern of
responding.


Interval and Time Schedules 


Shifting from an interval to a time schedule main- 
tains the temporal placement of the reinforcing stim- 
ulus while changing the tendency to respond. Interval 
and time schedules can be matched for minimum 
interreinforcer interval, but they differ in their ability 
to maintain a specific response. Time schedules main-
tain a lower response rate than do interval schedules
(Appel & Hiss, 1962; Herrnstein, 1966; Zeiler, 1968). 
In fact, it is not unusual for the response to stop alto- 
gether under time schedules. 

Interval and time schedules do, however, maintain
the same pattern of responding (see Figure 1). One
experiment (Zeiler, 1968) studied the effects of chang-
ing from variable to fixed schedules and vice versa;
the transitions involved both interval and time sched-
ules. If the preceding schedule was of the same type in
terms of reinforcer regularity, the same pattern was
maintained; if the schedules were of different types,
the pattern changed accordingly. In this respect it
made no difference whether the schedules were inter-
val or time, although there were large effects on re-
sponse rate. The temporal pattern of reinforcer de-
liveries controlled the pattern of responding.

Interval and time schedules generate similar pat-
terns despite the differences they produce in the be-
havior occurring at the moment of reinforcer presen-
tation (the interval schedule guarantees that it be a
specific response, whereas the time schedule allows the
response to vary). These observations are not in ac-
cord with analyses of patterning that stress the impor-
tance of the events occurring in close contiguity with

indication that the precise quantitative relations ob- 
taining between responses and reinforcers do not con- 
trol patterning. Dews compared performance on a 
fixed-interval schedule with that occurring when a 
small fixed-ratio requirement or a short delay of rein- 
forcement was added to the end of the interval. Some 
of these manipulations changed the rate of respond- 
ing, but none affected the pattern. As long as the 
temporal placement of the reinforcing stimulus was 
not markedly disturbed, the pattern was unchanged. 


Quantitative Measurement of Temporal
Placement of the Reinforcing Stimulus


Fixed and variable schedules can be differentiated
in terms of variability of interreinforcer time (interval
and time schedules) or number of responses per rein-
forcer (ratio schedules). With the fixed schedules, the
parameter value indicates where each reinforcer is
located; however, with variable schedules, the param-
eter indicates only the average location. (In random
schedules there is no indication of the location but
simply information about the probability of reinforcer
presentation at any moment or for any response.) In
an attempt to describe responding under variable-
interval schedules, Catania and Reynolds (1968) have
related response rate at a given point in postreinforcer
time to the availability of the reinforcer at that point.
They proposed two measures of temporal placement
of the reinforcer: (1) probability of reinforcement;
and (2) local rate of reinforcement.


PROBABILITY OF REINFORCEMENT


One measure of reinforcement density within a
time period is how often reinforcers occur in a given
interval relative to how often the interval occurs.
Catania and Reynolds (1968) call this probability of
reinforcement, or reinforcements per opportunity.
Consider a variable-interval 1-min schedule with the
following 20 interreinforcer intervals occurring in
some irregular order: 3, 7, 10, 22, 28, 30, 42, 45, 50, 54,
62, 65, 66, 67, 85, 90, 95, 105, 134, and 140 sec. These
individual intervals can be categorized in successive
20-sec class intervals as shown in column (a) of Table
1. The result is seven class intervals. Column (b) shows
the number of reinforcers in each of these class inter-
vals. Not each class interval has an equal opportunity
for occurrence, since each can only occur when a
shorter interval is not in force. For example, the 21-
40-sec class interval exists only when one of the three
members of the 0-20-sec class is not in effect, whereas
the 121-140-sec class interval can only occur if no
individual interval less than 121 sec is in force.
Column (c) shows the number of opportunities for
each class interval in each cycle of the 20 individual
interreinforcer intervals. The probability of rein-
forcement for each class interval is computed by
dividing column (b) by column (c). These probabil-
ities are shown in column (d); they represent the prob-
ability of a reinforcer presentation in that region of
time since the last reinforcer, given that the region
has been reached.
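
The computation of columns (b), (c), and (d) can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that each class covers times greater than its lower edge up to and including its upper edge (so 20 sec falls in the first class and 21 sec in the second).

```python
def reinforcement_probabilities(intervals, class_width=20):
    """Return one (reinforcers, opportunities, probability) tuple per
    class interval, following the reinforcements-per-opportunity idea."""
    n_classes = -(-max(intervals) // class_width)  # ceiling division
    rows = []
    for k in range(n_classes):
        lower, upper = k * class_width, (k + 1) * class_width
        # Reinforcers: scheduled intervals that end inside this class.
        reinforcers = sum(1 for x in intervals if lower < x <= upper)
        # Opportunities: scheduled intervals long enough to reach this class.
        opportunities = sum(1 for x in intervals if x > lower)
        rows.append((reinforcers, opportunities, reinforcers / opportunities))
    return rows


# The 20 interreinforcer intervals (sec) of the VI 1-min example.
VI_INTERVALS = [3, 7, 10, 22, 28, 30, 42, 45, 50, 54,
                62, 65, 66, 67, 85, 90, 95, 105, 134, 140]

for reinforcers, opportunities, p in reinforcement_probabilities(VI_INTERVALS):
    print(reinforcers, opportunities, round(p, 3))
```

Run on the 20 intervals of the example, the sketch gives 3, 3, 4, 4, 3, 1, and 2 reinforcers against 20, 17, 14, 10, 6, 3, and 2 opportunities.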

Catania and Reynolds (1968) studied a number of
different types of variable-interval schedules differ-
entiated by the way the individual interreinforcer
intervals were selected. All of these had the same
mean value; i.e., each was a particular-valued variable-
interval schedule, but they all differed in the probabil-
ity of reinforcement occurring with respect to succes-
sive class intervals. In three types of VI (arithmetic,
linear, constant probability), reinforcement probabil-
ity in an interval and the response rate in that inter-
val (local response rate) both increased as time
elapsed. However, in two other types (geometric,
Fibonacci), local response rate decreased as probability
of reinforcement increased. Therefore, probability of
reinforcement in an interval cannot generally explain
local response rate under variable-interval schedules.

Table 1. Quantitative Measurement of Reinforcer Placement in VI Schedules

(a)         (b)           (c)             (d)              (e)           (f)
Class       Number of     Opportunities   Probability of   Time Spent    Local Rate of
Intervals   Reinforcers                   Reinforcer       in Interval   Reinforcement
(sec)                                     Presentation     (sec)         (Reinforcers/hr)

0-20        3             20              .150             360           30.0
21-40       3             17              .176             298           36.2
41-60       4             14              .286             228           63.2
61-80       4             10              .400             136           105.9
81-100      3             6               .500             87            124.1
101-120     1             3               .333             44            81.8
121-140     2             2               1.000            32            225.0


LOCAL RATE OF REINFORCEMENT


Catania and Reynolds devised another measure of
reinforcer density. This measure takes into account
not simply how often a given class interval occurs,
but also how much time is spent in that interval rela-
tive to the number of reinforcer presentations. Con-
sider the 0-20-sec class interval in Table 1. Except
when the 3-, 7-, or 10-sec interreinforcer intervals are
in effect, the interval lasts for 20 sec. (This computa-
tional procedure is modified slightly from Catania
and Reynolds by not splitting the time between suc-
cessive classes.) This will occur in 17 of the 20 inter-
reinforcer intervals, yielding a total time of 340 sec
spent in the 0-20-sec class interval. In addition, the
organism will spend 3 sec in that class interval when
the 3-sec interval is scheduled, 7 sec when the 7-sec
interval is in effect, and 10 sec when the 10-sec interval
prevails. Over the 20 interreinforcer intervals, there-
fore, the 0-20-sec class interval will be in effect for a
total of 360 sec. The time spent in each class interval
is shown in column (e) of Table 1. The 21-40-sec class
interval will be in effect a total of 298 sec, because
there will be 3 intervals in which the class is never
reached, 14 in which it will be in effect for the entire
20 sec (the 14 interreinforcer intervals longer than 40
sec), and a total of 18 sec (divided among the 22-, 28-,
and 30-sec intervals) in which the class begins but is
terminated prior to 40 sec. The computations for the
other class intervals are equivalent. The local rate of
reinforcement (column f) is computed by dividing the
number of reinforcers presented in a class interval
(column b) by the total time spent in that interval
(column e).
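
A minimal Python sketch of the time-in-class and local-rate computation is given below. It is not the authors' exact procedure: the function name is hypothetical, and each class is treated as a bin running from its lower edge (0, 20, 40, ... sec) to its upper edge. Under that assumption the totals for some classes differ by a few seconds from the printed column (e), which appears to measure partial time from the printed lower bound of each class, so the resulting rates agree with column (f) only approximately.

```python
def local_reinforcement_rates(intervals, class_width=20):
    """Return (lower, upper, reinforcers, seconds_in_class, rate_per_hr)
    for each class interval of postreinforcement time."""
    n_classes = -(-max(intervals) // class_width)  # ceiling division
    rows = []
    for k in range(n_classes):
        lower, upper = k * class_width, (k + 1) * class_width
        # Every interval that reaches the class spends the whole class
        # width in it, or only the part before its own reinforcer.
        seconds = sum(min(x, upper) - lower for x in intervals if x > lower)
        reinforcers = sum(1 for x in intervals if lower < x <= upper)
        rows.append((lower, upper, reinforcers, seconds,
                     3600 * reinforcers / seconds))
    return rows


# The 20 interreinforcer intervals (sec) of the VI 1-min example.
VI_INTERVALS = [3, 7, 10, 22, 28, 30, 42, 45, 50, 54,
                62, 65, 66, 67, 85, 90, 95, 105, 134, 140]

for lower, upper, n, seconds, rate in local_reinforcement_rates(VI_INTERVALS):
    print(f"{lower}-{upper} sec: {n} reinforcers in {seconds} sec, {rate:.1f}/hr")
```

For the first class the sketch reproduces the 360 sec and 30.0 reinforcers per hr worked out in the text.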

Figure 10 depicts the local rate of reinforcement
and the rates of responding at successive time periods
after reinforcer presentation in several types of vari-
able-interval schedule (Catania & Reynolds, 1968).
The correspondence between local response rate and
local rate of reinforcement was closer than the corre-
spondence between local response rate and probabil-
ity of reinforcement. The fit is not exact, primarily be-
cause of departures at low and intermediate values,
but the curves do not diverge as they sometimes do
for probability of reinforcement.

Fig. 10. Local rate of reinforcement (upper panels) in reinforce-
ments per hr and local rate of responding (lower panels) on
four different types of variable-interval schedule. Time units
on the abscissa are interreinforcer times relative to the average
interreinforcer interval (the VI value). (Data are for pigeons
from Catania & Reynolds, 1968.)


Local Rate of Reinforcement and the
Fixed-Interval Pattern


Catania and Reynolds (1968, Experiment 4) found 
that the local rate of reinforcement analysis did not 
describe fixed-interval performance. Note that with 
fixed-interval schedules the local rate of reinforce-
ment is 0.00 for all class intervals save the last, when
it is 1.00. Yet substantial responding occurs prior to
the end of the interval. 

Several theorists have proposed that fixed-interval
performance constitutes two phases, one correspond-
ing to the period of not responding (the pause period)
and the other corresponding to the period of respond-
ing (the work period). This general concept originated
with Skinner (1938) and has been elaborated by
Schneider (1969) and Shull, Guilkey, and Witty
(1972). Schneider noted, as Shull (1970) later con-
firmed, that the work period is of variable duration
(the period of pausing varies from one interval to the
next). He suggested, therefore, that the work period
is correlated with a variable-interval schedule, where-
as the pause period is a period of extinction. Ac-
cording to Schneider, the pattern observed under a
fixed-interval schedule is the outcome of extinction
followed by a variable-interval schedule.

Schneider’s account suggests the following applica-
tion of Catania and Reynolds’ (1968) local rate of
reinforcement analysis. Define the interreinforcer in-
tervals by measuring the time spent responding in
each individual fixed interval. The only difference
from VI is that the interreinforcer intervals are cal-
culated from behavior rather than imposed directly.
Compute the local rates of reinforcement from these
values. For example, if, under an FI 5-min schedule
the organism began responding 3 min prior to the end
of a particular interval, the value of 180 sec would be
assigned to that interval. Consider 10 such intervals
having the values of 5, 30, 40, 65, 100, 110, 115, 130,
145, and 160 sec. Column (a) of Table 2 shows these
values arranged into 20-sec class intervals, and column
(b) shows the total amount of time spent in each. For
example, if responding began 65 sec before reinforcer
presentation, 20 sec was spent in each of class inter-
vals 0-20, 21-40, and 41-60, and 5 sec was spent in
interval 61-80. No time was spent in any of the longer
classes. The number of reinforcers obtained in each
class interval appears in column (c). For example, if
responding began 30 sec before the end of the fixed
interval, the reinforcer occurred in the 21-40-sec class.
The local rate of reinforcement for each class interval
(column d) was obtained by dividing column (c) by
column (b).
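
The first two steps of this procedure, deriving a work period from each fixed interval and counting the reinforcers that fall in each class, can be sketched in Python. The function names and the idea of recording a pause per interval are hypothetical conveniences; the sketch assumes every work period is greater than zero, since a response is needed to produce the reinforcer.

```python
from math import ceil


def work_periods(fi_value, pauses):
    """Time spent responding in each fixed interval: the FI value
    minus the pause before responding began (all in sec)."""
    return [fi_value - pause for pause in pauses]


def reinforcers_per_class(durations, class_width=20):
    """Count how many work periods end in each class interval,
    where a class runs from just above its lower edge to its upper edge."""
    counts = [0] * ceil(max(durations) / class_width)
    for d in durations:
        counts[ceil(d / class_width) - 1] += 1  # assumes d > 0
    return counts


# The example from the text: FI 5 min, responding begins 3 min before the end.
print(work_periods(300, [120]))  # [180]

# The ten work periods (sec) used for Table 2.
EXAMPLE = [5, 30, 40, 65, 100, 110, 115, 130, 145, 160]
print(reinforcers_per_class(EXAMPLE))  # [1, 2, 0, 1, 1, 2, 1, 2]
```

Dividing each count by the total time spent in the corresponding class then gives the local rate of reinforcement, exactly as for the variable-interval case.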


Table 2. Local Rate of Reinforcement in FI Schedules

(a)         (b)           (c)           (d)
Class       Time Spent    Number of     Local Rate of
Intervals   in Interval   Reinforcers   Reinforcement
(sec)       (sec)                       (Reinforcers/hr)

0-20        185           1             19.5
21-40       168           2             42.9
41-60       140           0             0.0
61-80       124           1             29.0
81-100      109           1             33.0
101-120     83            2             86.7
121-140     49            1             73.5
141-160     23            2             313.0

The local response rate in a given class interval is
the total number of responses occurring in that inter-
val divided by the total time spent in that interval.
This procedure is analogous to that used with variable-
interval schedules by Catania and Reynolds, except
that the segments are defined with respect to the on-
set of responding in each interval.

Figure 11 shows the local rate of responding and
the local rate of reinforcement computed as described
above for 150 successive reinforcer presentations un-
der an FI 5-min schedule. The detailed correspon-
dence between the two curves is not close, although
generally both curves increase for three of the four
birds. The curves resemble those found by Catania
and Reynolds with arithmetic variable-interval sched-
ules. Hence fixed-interval performance follows no
more (but also no less) precisely from a local rate of
reinforcement analysis than does variable-interval
performance. The conceptualization of fixed-interval
performance as a combination of an extinction and
a variable-interval schedule does not conflict with
what is known about variable-interval performance.
Whether this will help in understanding either fixed-
interval or variable-interval behavior is still unclear.
One is now confronted with explaining variable-
interval performance, and it perhaps may be best
understood as a combination of fixed-interval sched-
ules.

Fig. 11. Local rate of reinforcement in reinforcements per hr
and local rate of responding on a FI 5-min schedule in pigeons.
The reference point is the break point. See text for a detailed
explanation. (Davis & Zeiler, unpublished data.)


Patterning Under Ratio Schedules 


Ratio schedules establish certain inevitable rela- 
tions between time and the reinforcing stimulus, al- 
though these are indirect rather than direct effects of 
the schedule. In fixed-ratio schedules the time that 
elapses from the first opportunity to respond until the 
reinforcer appears depends on ratio size and response 
rate. A reason for the initial pause and for its direct 
relation to ratio size (Felton & Lyon, 1966; Ferster &
Skinner, 1957; Powell, 1968) may be the relation be-
tween one reinforcer and the time until the next. The
pause might be attributed to the zero probability of
reinforcement for the first response, but then it is
difficult to understand why the first response should
ever occur. Since the differential presentation of the
reinforcer with respect to time always produces re-
sponding prior to the time the reinforcer is available,
a pause followed by responding occurs. The impor- 
tance of the interreinforcer interval is further implied 
by Killeen’s (1969) finding that pausing was about the 
same in a fixed-ratio schedule and in an interval 
schedule yoked to it in terms of interreinforcer time. 
The shorter pause under variable-ratio schedules 
would follow from the occasional occurrence of the 
reinforcer close in time to the preceding one. Given 
the possibility of temporal control, it would be of 
interest to know whether Catania and Reynolds’ (1968) 
local rate of reinforcement analysis of variable-interval 
performance applies equally well to behavior under 
variable-ratio schedules. This account of pausing fol- 
lowed by responding under ratio schedules is similar 
to that offered by Skinner (1938), Ferster and Skinner 
(1957), and Morse (1966). 


Temporal Placement and Not Responding 


Staddon’s (1970, 1972) experiments provide further 
evidence that the pattern of responding depends on 
the temporal placement of the reinforcing event in 
relation to responses. During some intervals food was 
available in the first 60 sec following a food presenta- 
tion; during other intervals it was only available after 
60 sec. Under one condition any food presentation 
available in the first 60 sec occurred only if a key peck 
had not occurred for 10 sec (R > 10 sec). If the food 
was available after 60 sec, it was produced by pecking 
according to a variable-interval schedule. Under an- 
other procedure the schedule orders were reversed: a 
peck was required in the first 60 sec, and the R>t 
schedule was in effect thereafter. 

The first procedure produced pauses followed by a 
substantial rate of key pecking, and the second pro- 
duced pecking followed by pausing. Thus the tem- 
poral placement of the reinforcing event with respect 
to two different forms of behavior determined the 
probability with which each occurred at a given point 
in time. 


The Components of Temporal Placement 


The temporal pattern of reinforcers explains the 
pattern of responding on different schedules. Temporal 
placement refers to the occurrence of reinforcers in 
time in relation to some reference point, e.g., the last 
reinforcer presentation or the beginning of a trial. 
Attempts to further analyze temporal placement have 
focused on two processes: temporal discrimination and 
delay of reinforcement. 


THE THEORETICAL BACKGROUND 


The use of delayed reinforcement to explain pat- 
terning in sequences originated with Hull's (1932) ac- 
count of why rats chose the shorter of two paths to the 
same goal and why they ran faster in the first section 
of a short alley than in the first section of a longer al- 
ley. Because the distance to reward differs, responding 
is reinforced with different delays. On the assumption 
that habit strength is inversely related to delay, choice 
of the shorter path or faster running in the shorter 
alley indicates that sequences of responses are influ- 
enced by their temporal remoteness from the reinforc- 
ing stimulus. 

In 1952 Hull distinguished between experiments 
involving short and long alleys and those involving a 
delay period imposed after a single response. The first 
type presents reinforcers immediately upon the com- 
pletion of the final response of the entire sequence 
(i.e., upon entering the goal box), so that the rein- 
forcer is delayed with respect to earlier parts of the 
sequence. Hull described such procedures as involving 
a gradient of reinforcement within a chain (chaining 
delay). The second procedure imposes delay after the 
terminal response; it does not involve intrinsic de- 
lays due to the relations between early and later 
parts of sequences (nonchaining delay). 

Spence (1956) pointed out that Hull distinguished 
chaining and nonchaining delay procedurally but not 
theoretically. Spence, in contrast, believed them to be 
the outcome of different processes. He proposed that 
behavior in the chaining situation was due to the 
similarity of goal box stimuli to stimuli occurring 
earlier in the chain. Response decrements in the non- 
chaining situation were attributed to other responses 
emitted during the delay period, and because they 
occur closer in time to the reinforcer these other re- 
sponses compete with the target response. Whereas 
Hull believed that response strength was the outcome 
of delayed reinforcement in both chaining and non- 
chaining delay situations, Spence maintained that the 
two situations had little in common and that neither 
represented a direct functional relation between re- 
sponse strength and delay. 

Dews’s (1962) interpretation of fixed-interval sched- 
ules is very similar to Hull’s analysis of chaining de- 
lay, and Dews, like Hull, used data from nonchaining- 
delay experiments to support the hypothesis. Skinner 
(1938), on the other hand, separated the two situa- 
tions. His explanation of fixed-interval performance 
resembled Spence’s account of chaining delay. It drew 
heavily on elapsed time as a stimulus; i.e., responding 
depends on the similarity of the temporal stimuli at 
any point to the temporal stimulus correlated with 
reinforcer presentation. This account emphasized stim- 
ulus properties rather than a theoretical gradient of 
reinforcement. Since Skinner also dealt with non- 
chaining delay in terms of competing adventitiously 
reinforced responses, his accounts of both chaining 
and nonchaining delay are similar to Spence’s. 

In the case of chaining delay, then, the issue is 
whether responding depends on the proximity to the 
end of the chain (Hull, Dews) or on the similarity of 
the conditions at any point to those present at the 
moment of reinforcer presentation (Skinner, Spence). 
Is fixed-interval performance due to the effect of a 
gradient of reinforcement on response strength, or 
does it depend on the time since the beginning of the 
interval in relation to the total interval duration? An 
answer to this question would indicate whether delay 
of reinforcement or temporal discrimination is the 
major controlling aspect of the temporal placement of 
reinforcement. 


TEMPORAL DISCRIMINATION 


The term temporal discrimination has a confusing 
history. Sometimes it has been used as a process to ex- 
plain behavior—e.g., the fixed-interval pattern is the 
outcome of the discrimination of time. But this shifted 
the problem to an explanation of how time comes to 
be discriminated, and it has never gone beyond a re- 
iteration of the independent variables; e.g., it happens 
when reinforcement occurs at a certain point in time. 
Thus the process interpretation simply tacks on tem- 
poral discrimination to the relation between the inde- 
pendent variable and observable performance. A more 
meaningful usage of temporal discrimination proposes 
that elapsed time may function as a stimulus. A test- 
able implication is that the time elapsing from the start 
of an interval has discriminative properties. Since, in a 
fixed-interval schedule, responding is never reinforced 
just after the interval begins, this period may serve as 
a discriminative stimulus controlling a low rate of 
responding. The same is true of fixed-ratio schedules, 
because responding just after reinforcer presentation 
is never followed by another reinforcer. The analysis 
can also be extended to patterning under variable 
schedules, where a reinforcer does sometimes occur 
just after the preceding one. The period immediately 
following reinforcer presentation does not come to 
function as a negative discriminative stimulus, and 
responses are emitted at a more stable rate. 


EXPERIMENTS ON TEMPORAL 
DISCRIMINATION 


Catania (1970) described several experiments deal- 
ing with the discrimination of time. The temporal dis- 
crimination experiment is identical to other experi- 
ments on stimulus control, differing only in that time is 
the stimulus dimension manipulated. In a study of 
wavelength discrimination, the experimenter correlates 
certain wavelengths with a reinforcing stimulus and 
others with different consequences; in a study of tem- 
poral discrimination, different time durations are 
selectively correlated with the reinforcer. 

In a temporal discrimination experiment, Stubbs 
(1968) had the center member of a three-key display 
transilluminated with white light for 1 of 10 dura- 
tions. Which of the two side keys was then correlated 
with reinforcer availability depended on the preced- 
ing duration: if the center key had been lit for 1 of 
the 5 shorter durations, responses to one of the keys 
produced food. If it had been lit for one of the longer 
durations, responses to the other key produced food. 
By plotting responding to each key as a function of 
the preceding white key duration, Stubbs was able to 
compute the Weber fraction for discrimination of 
time. There was clear differential responding with 
respect to duration as the antecedent stimulus. An- 
other example of a temporal discrimination experi- 
ment was provided by Reynolds and Catania (1962). 
Pigeons were exposed to a dark key for durations 
ranging from 3 to 30 seconds. When the key was 
then illuminated, responding was reinforced only if 
the preceding dark key duration had been a specific 
value (e.g., 3 sec). The birds responded at the highest 
rate following the duration correlated with reinforcer 
availability and at progressively lower rates with in- 
creasingly different durations. Clearly, duration can 
have discriminative properties. 


DELAY OF REINFORCEMENT 


Delay of reinforcement describes the time from a 
response until the reinforcer appears. Dews (1962) 
hypothesized that delay of reinforcement is a deter- 
minant of fixed-interval patterning, since the rein- 
forcer follows not only the last response in the inter- 
val, but also earlier ones, albeit with longer delays. 
Responses occurring closest to the reinforcing stimu- 
lus are subjected to the shortest delay; responses more 
remote from reinforcement are subjected to longer 
delays. “The progressive increase in rate of respond- 
ing through the fixed interval would be based on a 
declining retroactive rate-enhancing effect of the 
reinforcing stimuli as the delay between response and 
reinforcement is increased” (Dews, 1962, p. 373). 


STUDIES OF DELAYED REINFORCEMENT 


In order to evaluate the possibility that delay of 
reinforcement influences fixed-interval performance, 
it is necessary to observe its direct effects. Such re- 
search has manipulated the delay interval by having 
the reinforcer occur at varied times after the response. 
The experiments have differed in whether or not re- 
sponses during the delay period reset the delay timer 
and in whether or not there was a distinctive stimulus 
correlated with the delay period. 

A distinctive stimulus correlated with the delay 
usually eliminates responding during that period; thus 
it makes little difference whether or not responses 
reset the delay timer. Responding may either decrease 
as a function of delay (e.g., Hamilton, 1929; Pierce, 
Hanford, & Zimmerman, 1972) or be maintained with 
delays as long as 24 hr (Azzi, Fix, Keller, & Rocha e 
Silva, 1964; Ferster, 1953; Ferster & Hammer, 1965). 
These correlated stimulus conditions would not, in 
any event, seem to be relevant to fixed-interval per- 
formance, since the fixed-interval schedule provides 
no delay stimulus. 

In the absence of distinctive stimuli, there is a 
difference depending on whether or not responses 
during the delay period reset the delay timer. If each 
response prolongs the delay (reset condition), respond- 
ing typically declines with increased delays (Azzi et al., 
1964; Dews, 1960; Skinner, 1938, pp. 139 ff.). This pro- 
cedure also has little relevance to fixed-interval sched- 
ules, in which the presumed delay period for each response 
is unaffected by subsequent responses. 
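
The difference between the reset and non-reset arrangements can be made concrete with a small sketch. It is an illustration only, not a description of any apparatus used in the studies cited: the function names, the millisecond time base, and the peck times are all assumptions. The first response starts a delay timer; under the reset condition every later response restarts it, and under the non-reset condition it runs undisturbed.

```python
def reinforcer_time(response_times_ms, delay_ms, reset):
    """Time (ms) at which the reinforcer is delivered.

    The first response starts the delay timer. With reset=True each further
    response made before the timer expires restarts it; with reset=False
    later responses leave the timer untouched.
    """
    times = sorted(response_times_ms)
    if not times:
        return None  # no response, so no reinforcer is scheduled
    due = times[0] + delay_ms
    if reset:
        for t in times[1:]:
            if t >= due:
                break            # timer has already run out
            due = t + delay_ms   # this response restarts the delay
    return due


def obtained_delay(response_times_ms, delay_ms, reset):
    """Delay (ms) between the last response before delivery and the reinforcer."""
    due = reinforcer_time(response_times_ms, delay_ms, reset)
    if due is None:
        return None
    last = max(t for t in response_times_ms if t <= due)
    return due - last


pecks = [0, 400, 700]  # hypothetical peck times in ms
print(obtained_delay(pecks, 1000, reset=False))  # 300
print(obtained_delay(pecks, 1000, reset=True))   # 1000
```

Under the non-reset rule the obtained delay can fall anywhere between zero and the nominal delay, whereas under the reset rule it always equals the nominal delay.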

The situation most nearly equivalent to fixed-inter- 
val schedules is one without delay stimuli and resets. 
Dews (1960) arranged delays of 10, 30, or 100 sec be- 
tween a key peck and food delivery. Responding was 
maintained with the 100-sec delay, but at a lower rate 
than with the 10-sec delay. Of the two birds given the 
30-sec delay, one had a higher rate at 30 sec than at 
10 sec, and the other had a lower rate at 30 sec. To 
some extent, then, the function relating response rate 
to delay of reinforcement corresponds to the hypothe- 
sized role played by delay in fixed-interval perfor- 
mance. 

Explicit delays of reinforcement added to fixed- 
interval schedules have large effects whether responses 
do (Skinner, 1938, pp. 139-150) or do not (Dews, 1969) 
reset the delay timer. Skinner used either an FI 4-min 
or an FI 5-min schedule with delays of reinforcement 
ranging from 0 to 8 sec in 2-sec steps. Rate declined 
33% with a 2-sec delay and more than 50% with 8 sec; 
in fact, Skinner concluded that the 6-sec delay prob- 
ably did not maintain responding at all. Dews added 
a 1-sec delay of reinforcement to an FI 3-min sched- 
ule. Because responses did not reset the delay timer, 
the actual delay could range from 0 to 1 sec (Dews 
mentions that the average delay was about 230 msec). 
Response rate was about half that occurring with im- 
mediate reinforcer presentation. These data suggest 
that a reinforcing stimulus has a small effective range 
in time. If delays as short as 1 sec sharply reduce 
response rate, the high average level of respond- 
ing maintained minutes or even hours prior to the 
appearance of the reinforcer under fixed-interval sched- 
ules cannot be attributed to the delay of reinforce- 
ment gradient. Morgan (1970) drew the same conclu- 
sion from his comparison of fixed-interval schedules 
with a conjunctive fixed-time, fixed-ratio schedule 
that maintained the same single-response requirement 
and interreinforcer interval, but did not guarantee 
that a response be contiguous with the reinforcer. 
Skinner’s, Dews’s, and Morgan’s experiments demon- 
strate that substantial responding is maintained only 
if there is very close temporal contiguity between the 
final response and the reinforcer. 


DELAY OF REINFORCEMENT IN 
SCHEDULES INVOLVING 
RESPONSE NUMBER 


Delay of reinforcement may operate when sev- 
eral responses must precede each reinforcer presen- 
tation. Catania (1971) required a sequence of re- 
sponses to either one or two keys. If all of the 
responses except the first had to be to one key (key 
A), while the first response had to be to the other 
key (key B), the rate of responding to key B decreased 
as the number of subsequent key A responses was in- 
creased from 1 to 11. In another experiment, the 
sequence was always four responses. In one set of con- 
ditions the last response and two of the other three 
had to be to key A, while the remaining response had 
to be to key B. The closer the key B response was to 
the reinforcer, the higher was the rate of responding 
to key B. Catania attributed the results to the differ- 
ential delay of reinforcement inherent in his pro- 
cedures. The closer key B responses were to the rein- 
forcer, the shorter the delay and hence the higher the 
response rate. 

Catania (1971) observed that when the required se- 
quence of responses was changed, the interreinforcer 
time increased substantially at first but then decreased. 
The decrease implies that the probability of the spe- 
cifically required sequence increased. Yet Catania re- 
ported that stereotyped sequences did not occur. 
Perhaps the changes in interreinforcer time are at- 
tributable to changed interactions between two inde- 
pendent responses occurring at different rates. They 
could also be due to modification of the emitted re- 
sponse unit. 


INTERVAL RELATIVITY 


Further consideration indicates that temporal dis- 
crimination and delay of reinforcement actually are 
not distinguishable in interval schedules. The tem- 
poral relations in two types of experimental arrange- 
ment are shown in Figure 12 (derived from Jenkins, 
1970). The free operant procedure describes the 
typical fixed-interval or fixed-time schedule in which 
the prevailing exteroceptive stimuli do not change 
until the reinforcer appears. Some event (time zero, 
or T0) marks the beginning of the interval; often 
this event is the end of a reinforcer presentation 
accompanied by a stimulus change such as the il- 
lumination of the key and/or houselight. The orig- 
inating event also could be the end of a blackout 
period, or any other stimulus that marks the onset 
of the interval. There is then a continuous period 
(N-period) in which a reinforcer is not available. 
The completion of the interval (time completed, or 
Tc) occurs when the reinforcer appears. Any par- 
ticular point within the N-period is referred to as N 
and can be localized either with respect to T0 (the 
T0-N interval) or Tc (the N-Tc interval). Another 
procedure involves discrete trials. Responding is not 
free to occur at any time, but is limited to periods 
marked by distinctive stimuli and separated by inter- 
trial intervals during which responses have no sched- 
uled consequences. In some trials (R-trials) a rein- 
forcer is available; in others (N-trials) it is not. There 
can be a series of N-trials between successive R-trials 
(e.g., Dews, 1966a), or, as shown in Figure 12, there 
can be alternating N- and R-trials (e.g., Dews, 1962; 
Jenkins, 1970). The location of an N-trial is specified 
either by the time from the end of the preceding 
R-trial to the end of the N-trial (the R0-N interval) 
or the time from the end of the N-trial to the begin- 
ning of the next R-trial (the N-R1 interval). The in- 
terreinforcer intervals are T0-Tc in the free operant 
procedure and R0-R1 in the trial procedure. 

The T0-N interval as the controlling factor ap- 
pears in accounts that emphasize elapsed time as a 
discriminative stimulus (time from the beginning of 
the interval); the N-Tc interval is crucial for delay-of- 
reinforcement explanations (the time between a re- 
sponse and the next reinforcer presentation). It is clear 
from Figure 12, however, that the T0-N and N-Tc 
intervals are not independent, given that the T0-Tc 
interval (the total fixed interval) is held constant. 
Since the two are perfectly negatively correlated, their 
effects cannot be separated. In addition, two lines of 
evidence suggest that they must be considered to- 
gether. 

When Dews (1970) plotted relative rate in each 
one-fifth of the interval, curves for fixed intervals of 30 sec, 
300 sec, and 3000 sec fell on top of one another. The 
relative patterns were independent of fixed-interval 
value. Even though equivalent fifths represented very 
different absolute time periods from the beginning or 
to the end of the fixed interval, relative position and 
patterning was constant. 

Figure 12. Temporal intervals. The upper segment (free oper- 
ant) illustrates the typical fixed-interval schedule. T0 indicates 
the stimulus change indicating the beginning of the interval. 
The stimulus continues until Tc, i.e., the presentation of the 
reinforcer that ends the interval. N is any point during the 
period between T0 and Tc. The lower segment (trial procedure) 
illustrates the condition involving restricted opportunities to 
respond. R refers to trials in which a reinforcer is available, N 
to trials in which it is not. 
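
The normalization behind this comparison can be sketched as follows. This is an illustrative assumption about the bookkeeping, not Dews's own computation: the function name is hypothetical, the response times are invented, and responding in each fifth is expressed here as a proportion of all responses in the interval.

```python
def fifths_profile(response_times, interval):
    """Proportion of an interval's responses that fall in each fifth."""
    counts = [0] * 5
    for t in response_times:
        if not 0 <= t <= interval:
            raise ValueError("response time lies outside the interval")
        # Integer position of t among the five equal segments; a response
        # exactly at the end of the interval is counted in the last fifth.
        index = min(int(5 * t // interval), 4)
        counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * 5


# Hypothetical response times (sec) under FI 30 sec, and the same pattern
# stretched a hundredfold to fit FI 3000 sec.
short = [10, 20, 25, 28, 29]
long_ = [t * 100 for t in short]

print(fifths_profile(short, 30))
print(fifths_profile(long_, 3000))
```

Two records that differ only in time scale give the same profile, which is the sense in which curves for very different fixed-interval values can fall on top of one another.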

Jenkins (1970) used a trial procedure to separate 
absolute time from the beginning and time to the end 
of the interval. The variables manipulated were the 
time from the end of one R-trial to the end of the 
N-trial (the R0-N interval) and the time from the end of 
the N-trial to the beginning of the next R-trial (the 
N-R1 interval). (Jenkins described the independent 
variables somewhat differently; he measured time 
from the beginning of each trial. The present descrip- 
tion is used to facilitate the reanalysis of his data.) 
Table 3 shows eight conditions (columns a and b) and 
the results (column f). 


Table 3 Intervals Studied by Jenkins (1970) 


        INTERVALS                RELATIVE PROXIMITY      Average 
  (max. duration in sec)                                 Responses 
                                                         on N-trials 

   (a)      (b)      (c)          (d)         (e)           (f) 
  R0-N     N-R1     R0-R1       R0-N/       N-R1/ 
                   Interval     R0-R1       R0-R1 

   54        8       62          .87         .13           G7 
  154        8      162          .95         .05           8.6 
   54      108      162          .33         .67           9.3 
  154      108      262          .59         .41           3.5 
  154        8      162          .95         .05          12.9 
  154      207      361          .43         .57           6.7 
   54       20       74          .73         .27           6.1 
  154       20      174          .89         .11           6.6 


All intervals are given in terms of their maximum duration 
in sec. 


Neither the absolute R0-N nor N-R1 intervals in- 
dependently determined responding in the N-trials, 
since responding was not closely related to the value 
of either. But it is important to note that whenever 
there was a change in either the R0-N or the N-R1 
interval, there was also a change in the other, in the 
time from the end of one R-trial to the beginning of 
the next (the R0-R1 interval), or in both. Column (c) 
of Table 3 shows the R0-R1 interval 
for each condition. Columns (d) and (e) show the 
R0-N and N-R1 intervals as proportions of the R0-R1 
intervals, i.e., they describe the relative proximity of 
an N-trial to the overall interreinforcer period. 

Although analyses based on the absolute quantita- 
tive data do not show very orderly results, the rank- 
order correlation between these relative proximity 
measures and responding on the N-trials was .88, with 
the sign being either positive or negative depending 
on whether the R0-N or the N-R1 interval is used. In 
these eight conditions of Jenkins’s discrete-trial pro- 
cedure, as in the fixed-interval schedule, responding 
at any point during the interreinforcer interval was 
related to the proportion of the total interval that had 
elapsed (or was remaining) at that point. 
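
The arithmetic behind the relative proximity measures, and the reason the correlation changes sign with the choice of measure, can be sketched as follows. The interval pairs are read from columns (a) and (b) of Table 3 as they survive in this copy and should be treated as illustrative. The response vector is invented purely to exercise the code and is not Jenkins's data. The function names are assumptions, and the rank correlation is an ordinary Spearman coefficient with average ranks for ties.

```python
# (R0-N, N-R1) maximum durations in sec, one pair per condition.
CONDITIONS = [(54, 8), (154, 8), (54, 108), (154, 108),
              (154, 8), (154, 207), (54, 20), (154, 20)]


def relative_proximity(r0_n, n_r1):
    """Return the two intervals as proportions of the R0-R1 interval."""
    total = r0_n + n_r1                  # column (c)
    return r0_n / total, n_r1 / total    # columns (d) and (e)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


col_d = [relative_proximity(a, b)[0] for a, b in CONDITIONS]
col_e = [relative_proximity(a, b)[1] for a, b in CONDITIONS]

# Invented response values, NOT the values in column (f).
hypothetical_responses = [5.0, 9.0, 1.0, 3.0, 10.0, 2.0, 4.0, 6.0]

print(spearman(col_d, hypothetical_responses))
print(spearman(col_e, hypothetical_responses))
```

Because the two proportions always sum to 1, their rank orders are exact mirror images, so the two coefficients are equal in size and opposite in sign whatever the response values are.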


CONCLUSION 


The temporal location of a point in an inter- 
val can be specified from either the beginning or 
the end of an interval. Changes in the relative loca- 
tion of the point always mean relative changes in both 
the time from the beginning of the interval and the 
time to the end of the interval, and thus these two 
must be totally confounded. It cannot be asserted, 
therefore, that either of the periods is responsible for 
an effect. The important factor is the location of a 
given point relative to the overall duration of the 
interval (cf. Jenkins’s, 1970, relative proximity prin- 
ciple). 


SEQUENCES AND UNITS 


The Chaining Hypothesis 


Schedules of intermittent reinforcement impose or- 
der on the numerous responses that occur between 
successive reinforcers; the behavior generated has a 
sequential organization. Historically, it has been com- 
mon to explain sequences in terms of response chain- 
ing. The essence of the chaining hypothesis is that 
each response provides the stimulus (or part of the 
stimulus complex) for the next response. These 
response-produced stimuli have an eliciting or dis- 
criminative stimulus function with respect to the 
next response, and they may have a reinforcing func- 
tion with respect to preceding responses. Individual 
learning theories differ about whether the chain is 
composed of reflexes or of instrumental responses, but 
the concept of each response acting as a stimulus for 
the next is common to all. 

Platt and Johnson (1971) reviewed data showing 
that behavior can have stimulus properties. Behavior 
under mixed FR FR schedules (two fixed-ratio sched- 
ules occur in some order, but with no exteroceptive 
stimulus indicating which one is in effect) shows dif- 
ferential sensitivity to the two ratio requirements. In 
one type of experiment, Rilling and McDiarmid 
(1965), Pliskoff and Goldiamond (1966), and others 
have shown that pigeons can make appropriate choices 
on the basis of the size of the fixed ratio just com- 
pleted. In another type of experiment, Platt and John- 
son (1971) delivered food to a rat only if the rat’s 
approach to the food tray was preceded by N lever 
presses. Approaches prior to N lever presses resulted 
in a time-out and reset the response number require- 
ment. The mean number of responses emitted per tray 
entry slightly exceeded the number required at all 
values of N. These data support the hypothesis that 
organisms can respond differentially to their own be- 
havior. 
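
The counting contingency in the Platt and Johnson procedure can be sketched as a small scoring routine. This is a hypothetical reading of the procedure as described above, not their apparatus logic: the event codes, the function name, the assumption that at least N presses (not exactly N) satisfy the requirement, and the assumption that the count also starts over after food are all illustrative.

```python
def score_session(events, required):
    """Score each tray approach in a stream of events.

    events   -- iterable of "L" (lever press) or "T" (tray approach)
    required -- the response number requirement N
    Returns (outcomes, run_lengths): the outcome of each approach and the
    number of lever presses that preceded it.
    """
    outcomes, run_lengths = [], []
    presses = 0
    for event in events:
        if event == "L":
            presses += 1
        elif event == "T":
            run_lengths.append(presses)
            # An early approach produces a time-out; either way the
            # press count starts again from zero.
            outcomes.append("food" if presses >= required else "timeout")
            presses = 0
        else:
            raise ValueError(f"unknown event code: {event!r}")
    return outcomes, run_lengths


outcomes, runs = score_session("LLLTLLTLLLLT", required=3)
print(outcomes)                # ['food', 'timeout', 'food']
print(sum(runs) / len(runs))   # 3.0
```

The mean run length per tray entry is the quantity that, in the data described above, slightly exceeded N.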

In all these experiments different consequences 
were arranged following the emission of different 
numbers of responses. There is no evidence that re- 
sponses function as discriminative stimuli in the 
absence of explicit differential reinforcement. Animals 
in extinction after fixed-ratio training will emit very 
long runs of responses before pausing (Ferster & Skin- 
ner, 1957, pp. 57-63); this provides no evidence for 
sensitivity to the number of responses previously re- 
quired for reinforcer presentation. Data reported by 
Overmann and Denny (1974) suggest that the animals 
are controlled by the exteroceptive stimulus changes 
correlated with completion of the ratio rather than 
by response number. Control by response number ap- 
pears to require that there be different schedules cor- 
related with different numbers of responses; it does 
not arise automatically through the differential rein- 
forcement of response number inherent in simple 
fixed-ratio schedules. 

Chaining has been proposed as being involved in 
performance under fixed-interval schedules (Ferster & 
Skinner, 1957; Skinner, 1938), but this hypothesis no 
longer seems tenable. According to the chaining hy- 
pothesis, sequential changes in responding are due to 
the stimulus properties of preceding responses; con- 
sequently, the pattern should not survive interrup- 
tions in the flow of responses. Dews (1962, 1965a, 
1965b, 1966a, 1966b) interpolated stimuli that dis- 
rupted responding at various points during fixed in- 
tervals. Although this should break the hypothetical 
chain, the basic fixed-interval pattern endured: when 
responding occurred, it was appropriate to the tem- 
poral location within the interval regardless of the 
response rate immediately preceding. Similar results 
have been obtained by others (Farmer & Schoenfeld, 
1966; Ferster & Skinner, 1957, pp. 213 ff.; McKearney, 
1970). Chaining, therefore, is not necessary for fixed- 
interval performance. 

The chaining hypothesis has limited value. Kelle- 
her (1966) objected to it as inherently untestable be- 
cause it is based on purely inferred stimuli, while 
Lashley (1951) believed that the speed and precision 
with which many sequences are executed made it un- 
tenable to assume that each response is controlled by 
feedback from the preceding response. Data reported 
by Taub and Berman (1968) and Lashley (1917) show 
that precise sequences occur in organisms deprived of 
sensory feedback from their responses. If chaining oc- 
curs, perhaps it does so only under specialized condi- 
tions of differential reinforcement. 


Response Units 


THE SPECIFICATION OF A UNIT 


Units other than the individual response might be 
affected by reinforcement and play a role in schedule- 
controlled performance. For example, Morse (1966) 
suggested the interresponse time as a reinforceable as- 
pect of behavior, and Staddon (1967) proposed that 
the entire temporal pattern of responding may be a 
unit. The experimental evaluation of whether a cer- 
tain form of behavior is a unit requires clarifying the 
different meanings of response units. (Although re- 
sponses are emphasized here, a parallel case can be 
made for stimulus units.) Three different kinds of re- 
sponse unit can be distinguished. These are described 
here as formal, conditionable, and theoretical units. 

One kind of response unit refers to the class of be- 
havior that the experimenter prescribes as prerequisite 
for a reinforcer presentation; this is simply the opera- 
tional definition of the measured response. This is the 
formal response unit. The formal unit is always un- 
ambiguous, but it need not be experimentally inter- 
esting or useful. To be so, a formal unit must obey a 
plasticity principle: its probability of occurrence 
should be affected by its consequences. Some formal 
units display this plasticity, whereas others do not. 
The term operant has been used to describe modifi- 
able units; here they will be referred to as condition- 
able response units. Conditionable units, like formal 
ones, are unambiguous: if some behavior is required 
for reinforcer presentation and it increases in proba- 
bility, it is a conditionable response unit. 

The term response unit may also be used to refer to 
something inferred rather than observed directly. A 
response, a stimulus-response relation, or some cogni- 
tive activity, can be postulated to underlie observed 
performance. Inferred units are being used when it is 
asserted that organisms learn turning responses, or to 
approach certain locations in a maze, or interresponse 
times, or entire sequences of behavior. These inferred 
units will be referred to as theoretical response units. 

The interresponse time illustrates the distinction 
between formal, conditionable, and theoretical units. 
If the interresponse time is specified as the require- 
ment for the delivery of a reinforcing stimulus, it is 
the formal unit. If it should be altered by the imposi- 
tion of these consequences, it is a conditionable unit. 
It is a theoretical unit when it is used to explain per- 
formance under a schedule in which some other be- 
havior is specified as the formal unit. For example, in 
a fixed-ratio schedule the formal unit is the sequence 
of required responses. However, if the behavior that 
emerges is then explained as the outcome of differen- 
tial reinforcement of short interresponse times, then 
the interresponse time is a theoretical unit. In other 
words, a theoretical unit is one that is hypothesized to 
underlie observed behavior. 

Evaluating the plausibility of a particular theo- 
retical unit requires a strategy analogous to that used 
to evaluate indirect variables: the hypothesized unit 
must be studied directly. This means that the theo- 
retical unit must be specified as a formal unit. If it 
then proves to be conditionable, and if it is affected 
as it is hypothesized to be when it is used to explain a 
given performance, the theoretical construct gains 
plausibility. 


Interresponse Time 


An interresponse time (IRT) was initially concep- 
tualized as a stimulus. Skinner (1938, pp. 247-284) 
described time since the preceding response as a dis- 
criminative stimulus controlling the emission of the 
next response, and Anger (1956) treated the interre- 
sponse time in the same way. Reynolds (1966) demon- 
strated that an IRT could indeed function as a 
discriminative stimulus by showing that if responding 
in the second component of a chained schedule was 
followed by food only if the IRT in the first compo- 
nent exceeded 18 seconds, response rate in the second 
component varied as a function of the preceding IRT. 
Thus the IRT as an antecedent event produced differ- 
ential responding, i.e., it exerted stimulus control over 
responding. 

Morse (1966) treated the IRT as a shaped and rein- 
forced property of behavior rather than as an ante- 
cedent stimulus for a response. In the typical schedule 
of reinforcement in which IRTs are postulated as 
controlling behavior, the IRT is a property of emitted 
behavior and is not manipulated as a stimulus. In- 
stead, stimulus properties are inferred from respond- 
ing. If the IRT is treated as a differentiated response 
unit, unobservable stimuli need not be postulated as
controlling observable performance. Given the one-to-
one correspondence between response and inferred
stimulus properties, however, the two treatments ap- 
pear to be equivalent. 


IRTs AS THEORETICAL UNITS


Ferster and Skinner (1957) demonstrated that IRT
reinforcement could explain the response rate differ- 
ences in interval and ratio schedules. If responses tend 
to occur in bursts (Skinner’s, 1938, fourth-order devia- 
tion from a steady rate), under ratio schedules it is 
likely that the response requirement will be met during 
a burst. Therefore, the IRT correlated with reinforcer 
presentation is likely to be in the short range of those 
emitted. Ratio schedules do in fact generate homoge-
neous sequences of short IRTs, and those are the ones
emitted at the moment of reinforcement (Gott & Weiss,
1972; Weiss & Gott, 1972). In interval schedules, how-
ever, it is more likely that the reinforcer will follow a
period of pausing (more time elapses during a pause,
hence making it more likely that the reinforcer will be-
come available). Therefore, longer IRTs will be prefer-
entially correlated with the reinforcing stimulus. Morse
(1966) showed formally that more short IRTs will be
reinforced on ratio than on interval schedules, and
Dews (1969) confirmed that the terminal IRT in fixed-
interval schedules is likely to be longer than the ones
immediately preceding. Morse also developed a model
for explaining patterning as the consequence of the
relation between IRTs and reinforcement.
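
The contrast between ratio and interval schedules can be sketched numerically. The following is only an illustration and not Morse's (1966) formal treatment: it assumes idealized constant-probability (random-interval and random-ratio) schedules, which the text does not specify, and the function names are hypothetical. Under that assumption, the probability that the response ending an IRT is reinforced grows with IRT length on the interval schedule but does not depend on IRT length on the ratio schedule.

```python
import math


def p_reinforced_interval(irt_s, mean_interval_s):
    """Probability that a reinforcer becomes available during an IRT of
    irt_s seconds, assuming reinforcers are set up at random in time with
    the given mean interval (an assumed random-interval schedule)."""
    return 1.0 - math.exp(-irt_s / mean_interval_s)


def p_reinforced_ratio(irt_s, mean_ratio):
    """Probability that a response is reinforced when every response has
    the same chance, 1/mean_ratio (an assumed random-ratio schedule).
    The IRT length does not enter the calculation."""
    return 1.0 / mean_ratio


if __name__ == "__main__":
    # Hypothetical schedule values chosen only for illustration.
    for irt in (0.5, 1.0, 2.0, 4.0, 8.0):
        print(irt,
              round(p_reinforced_interval(irt, 60.0), 4),
              round(p_reinforced_ratio(irt, 30), 4))
```

On these assumptions, any bias toward short reinforced IRTs on ratio schedules must come from the distribution of emitted IRTs (for example, responding in bursts), whereas the interval schedule itself favors long IRTs.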

The selective reinforcement of IRTs accounts for 
what are otherwise puzzling data. Consider response 
rate under tandem FI FR schedules. Rate is higher 
than on a simple fixed-interval schedule (Ferster & 
Skinner, 1957). The addition of a fixed ratio at the
end of the fixed interval makes it more likely that the 
reinforcer will follow a short IRT, and thereby prefer- 
entially reinforces short IRTs. If the emitted IRT dis- 
tribution changes in the direction of the reinforced 
IRT distribution, it is to be expected that rate should 
increase with the tandem schedule. 

Some data from t- and τ-schedules also fit an IRT
analysis. In many of these schedules a reinforcer is
available for a single response occurring at some point 
in time. Restricting the duration of the time period 
has large effects on response rate: rate increases as 
availability decreases. Morse (1966) has suggested that 
the period of reinforcer availability determines which 
IRTs are preferentially reinforced. The shorter the 
availability period, the more the schedule favors the 
reinforcement of short IRTs and the higher the con-
sequent response rate.


THE IRT AS A CONDITIONABLE
RESPONSE UNIT


The IRT as a theoretical unit has the potential to 
explain major aspects of schedule-controlled perfor- 
mance. What is necessary, then, is to examine IRTs as 
formal response units and see if they are conditionable. 

Tests of IRTs as conditionable responses derive from 
IRT > t and IRT < t schedules, which require that
an IRT either exceed or be less than a certain value
for a reinforcer to occur. Several experiments (e.g.,
Malott & Cumming, 1964; Richardson & Loughead,
1974; Staddon, 1965) have shown that if an IRT must
exceed some value, the distribution of emitted IRTs
changes in the appropriate direction. Similar results
occur when a terminal IRT requirement is added to a
variable-interval schedule, i.e., when the schedule is a
paced VI or a tandem VI IRT > t (Anger, 1956;
Shimp, 1967). Since the probability of the specific 
IRT that is required increases (although at long IRT 
requirements very few of the emitted IRTs may be 
long enough to produce the reinforcer), the IRT is a 
conditionable response unit. 

Quantitative analyses of the effects of temporal
differentiation schedules have shown similar results
whether the unit involved was the IRT or some aspect
of performance other than the interresponse time
(Catania, 1970; DeCasper & Zeiler, 1974). Performance
can be described by the power function $T = kt^{n}$,
where T is the emitted duration, t is the required
duration, and k and n are constants fit to the data. A
small range of values of k and n is needed to de-
scribe performance in most of these experiments. In
fact, DeCasper and Zeiler have suggested that the
effects of temporal differentiation schedules may be
independent of particular response units and dura-
tions.
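
One way to obtain k and n from paired required and emitted durations is a least-squares line through the log-transformed values, since log T = log k + n log t. The sketch below is an assumption about the fitting method, not a description of how Catania or DeCasper and Zeiler fit their data; the function name and the numbers are hypothetical.

```python
import math


def fit_power_function(required, emitted):
    """Estimate k and n in T = k * t**n by ordinary least squares on
    log-transformed durations (all durations must be positive)."""
    xs = [math.log(t) for t in required]
    ys = [math.log(T) for T in emitted]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                        # slope in log-log coordinates
    k = math.exp(mean_y - n * mean_x)    # intercept back-transformed
    return k, n


if __name__ == "__main__":
    # Made-up data generated from T = 1.5 * t**0.8, for illustration only.
    required = [1.0, 2.0, 4.0, 8.0, 16.0]
    emitted = [1.5 * t ** 0.8 for t in required]
    print(fit_power_function(required, emitted))
```

With n below 1, emitted durations exceed short requirements and fall short of long ones; whether real data show that pattern is an empirical matter the sketch does not address.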

A problem in interpreting rate changes under
IRT > t and other temporal differentiation schedules
is that as the time parameter increases, the number of
responses per reinforcer and average interreinforcer
time both increase as well. If response rate is a func-
tion of reinforcer density, IRTs would change as well
since they are the components of rate. Actually, how-
ever, this is not the major reason for the effects. Ferster
and Skinner (1957, p. 460) found that adding an
IRT > t requirement to a variable-interval schedule
reduced rate without markedly affecting the average
interreinforcer time. Richardson (1973) has also shown
that interreinforcer time is not responsible for the
IRT distributions in simple IRT > t schedules. An-
imals were given exactly the same temporal distribu-
tions of reinforcer presentations that they obtained
under IRT > t, but without the IRT requirement (a
variable-interval schedule yoked to the IRT > t sched-
ule). The response rates and distributions of IRTs
obtained under the IRT > t and yoked VI schedules
differed greatly, showing that the specific differential
reinforcement of IRTs was an important factor in
IRT > t performance.
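
The logic of the yoked control can be sketched as follows. This is a simplified illustration under stated assumptions, not Richardson's (1973) procedure: IRTs are measured only between successive responses, the session is taken to start at time zero, and both function names are hypothetical.

```python
def reinforced_times_irt_gt_t(response_times, t):
    """Times of responses that end an IRT longer than t seconds,
    i.e., the responses an IRT > t schedule would reinforce."""
    times = []
    for prev, cur in zip(response_times, response_times[1:]):
        if cur - prev > t:
            times.append(cur)
    return times


def yoked_intervals(reinforcer_times, session_start=0.0):
    """Successive interreinforcer intervals obtained under IRT > t.
    Replayed as the intervals of a VI schedule, they give the same
    temporal distribution of reinforcers without the IRT requirement."""
    intervals = []
    last = session_start
    for time in reinforcer_times:
        intervals.append(time - last)
        last = time
    return intervals


if __name__ == "__main__":
    # Hypothetical response times in seconds.
    responses = [0.0, 1.0, 6.0, 7.0, 13.0, 20.0]
    obtained = reinforced_times_irt_gt_t(responses, 5.0)
    print(obtained, yoked_intervals(obtained))
```

Because the yoked schedule holds reinforcer timing constant, any remaining difference in IRT distributions is attributable to the IRT requirement itself.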

Alleman and Platt (1973) used what they termed a 
percentile reinforcement schedule involving IRT rein- 
forcement. Platt (1973) has described such arrange- 
ments as shaping schedules: the prescribed behavior 
at any moment shifts with the nature of the behavior 
being emitted. In Alleman and Platt’s experiments 
the critical aspects of performance were the emitted 
IRTs. Only IRTs more extreme than a certain pro-
portion of the recently preceding IRTs were rein-
forced. Under some conditions, the requirement in-
volved IRTs shorter than those preceding; under
others, it involved longer IRTs. For example, the
IRT might have to be shorter than 95% of the previ-
ous IRTs. The emitted IRT distribution then became
progressively peaked at the lower end. A parametric
study showed that when the IRTs had to be among
the least frequent 5 or 10%, the distribution shifted
toward longer or shorter IRTs depending on which
was required. These effects occurred even if the same
number of responses per reinforcer was maintained
over the range of percentile requirements. Selectively
reinforcing extreme IRTs changed the IRT distributions
in the appropriate direction. Blough (1966)
showed the complementary effect: variable IRTs were 
produced by always reinforcing the least frequent ones 
without regard to direction. 
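
A percentile criterion of this kind can be sketched as a comparison of the current IRT with a running memory of recent IRTs. The sketch is a loose illustration only: the memory size, the handling of ties, and the function names are assumptions and are not taken from Alleman and Platt (1973).

```python
from collections import deque


def meets_percentile_criterion(irt, recent_irts, fraction=0.95,
                               direction="shorter"):
    """True if irt is shorter (or longer) than at least `fraction`
    of the IRTs held in recent_irts."""
    if not recent_irts:
        return False  # nothing to compare against yet
    if direction == "shorter":
        beaten = sum(1 for x in recent_irts if irt < x)
    elif direction == "longer":
        beaten = sum(1 for x in recent_irts if irt > x)
    else:
        raise ValueError("direction must be 'shorter' or 'longer'")
    return beaten / len(recent_irts) >= fraction


def run_session(irts, memory=20, fraction=0.95, direction="shorter"):
    """Apply the criterion to each IRT in turn. The comparison set is the
    last `memory` IRTs, so the requirement shifts with current behavior."""
    recent = deque(maxlen=memory)
    outcomes = []
    for irt in irts:
        outcomes.append(
            meets_percentile_criterion(irt, recent, fraction, direction))
        recent.append(irt)
    return outcomes


if __name__ == "__main__":
    # Hypothetical IRTs in seconds.
    print(run_session([1.0, 2.0, 0.5]))
```

Because the comparison set is updated after every response, the criterion follows the animal's own distribution, which is the sense in which Platt (1973) calls such arrangements shaping schedules.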

The data show, therefore, that an IRT is a condi-
tionable response. It is reinforceable directly when it
is specified as the formal response, and the changes in 
behavior are not attributable to changed reinforcer 
density. 


IRTs AND REINFORCED IRTs


What do these data imply about behavior under 
schedules not specifying the value of terminal IRTs 
directly? Anger (1956) suggested that IRT reinforce- 
ment operates to control performance with such sched- 
ules. He found that under variable-interval schedules 
the frequency distribution of IRTs correlated closely 
with the frequency distribution of reinforced IRTs. 
Those IRTs that were most often correlated with the
reinforcing stimulus occurred more frequently; the
shapes of the IRT distributions corresponded with the
shapes of the distribution of reinforcers per hour per
IRT.

Later analyses have shown that this relation is
probably not significant. Blough and Blough (1968)
were the first to point out that the dependency of the
reinforcer distribution on the IRT distribution is
mathematically forced, because “on a VI schedule, the
probability of reinforcement at a given IRT bin must
increase with the number of responses in that bin”
(pp. 26-27). Reynolds and McLeod (1970) showed that
a correlation between emitted and reinforced IRTs
holds regardless of the distribution of emitted IRTs.

Blough and Blough examined whether the distribu- 
tion of reinforcers among various IRTs determined 
the IRTs emitted subsequently. If this were the case, 
it might be expected that the IRT distribution would 
shift in the direction of those IRTs that earlier had 
received the most reinforcers per hour. There was no
indication that IRT distributions changed in this 
manner. 

Anger (personal communication) suggests that IRT 
reinforcement occurs in the context of the increase in 
the frequency of a response due to reinforcement, the 
decrease due to nonreinforced responses, recovery of 
responses after a period of nonreinforcement, and per- 
haps a rate-enhancing effect of extinction as well. 
These factors interact to produce oscillations in emit-
ted IRTs. Thus IRTs are the outcome of at least four
influences rather than being simply due to frequency 
of reinforcement. 

Since 1968 Shimp and his colleagues have been de- 
veloping a systematic quantitative account of IRT 
reinforcement. The various experiments have in-
volved a paced VI schedule (Anger, 1956; Ferster &
Skinner, 1957), in which the reinforcer occurs follow- 
ing the first appropriate IRT emitted when a VI 
schedule has made the reinforcer available. Shimp’s 
experiments have involved at least two IRT bands, 
each defined by a lower and upper bound. The pro- 
cedure involves distinctive stimuli presented only 
during IRT bands that are eligible for reinforcement. 
Thus reinforcement occurs only during the stimulus, 
and the stimulus is correlated with the IRT require- 
ments. The purpose of introducing stimuli in this 
way is to gain precise control over responding, but it 
seems possible that discriminative stimulus control 
may influence the performance in other ways as well. 
In any event, the dependent measure is the percentage 
of IRTs, P, falling in one of the bands. Considering 
the shorter class, P equals the number of IRTs in the
short class divided by the number of IRTs in both 
classes. 

In 1968 Shimp applied different relative frequen- 
cies or magnitudes of reinforcer presentation to 1.5- to
2.5-sec and 3.5- to 4.5-sec IRTs. The value of P was ap-
proximately linearly related to either relative fre-
quency or magnitude of reinforcement, but it did not 
match either. Shimp (1969) found that with equal 
reinforcement frequency for both IRT classes P 
matched the relative reciprocal (term used by Hawkes 
& Shimp, 1974, here called RR) of the classes. The 
RR was computed by considering the lower bound of
the short class ($t_{1,1}$) and the upper bound of that class
($t_{1,2}$) together with the lower ($t_{2,1}$) and upper ($t_{2,2}$)
bounds of the longer class. The relative reciprocal of
the short class is:

$$RR = \frac{1/t_{1,1} + 1/t_{1,2}}{1/t_{1,1} + 1/t_{1,2} + 1/t_{2,1} + 1/t_{2,2}}$$


The values of RR and P matched when both IRT 
classes were reinforced equally. However, matching of 
RR and P seems only part of a general function which 
varies according to the absolute values of the short 
class of IRTs. Matching occurs when $t_{1,1}$ is in the
1-4-sec range, and departures occur at other values 
(Hawkes & Shimp, 1974). 
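
As a worked illustration, the relative reciprocal and the dependent measure P can be computed directly from the class bounds and IRT counts. The bounds used below are the 1.5- to 2.5-sec and 3.5- to 4.5-sec classes mentioned above for the 1968 experiment and are used here purely as an example; the IRT counts and the function names are hypothetical.

```python
def relative_reciprocal(short_bounds, long_bounds):
    """Relative reciprocal (RR) of the short class: the summed reciprocals
    of its lower and upper bounds, divided by the summed reciprocals of
    the bounds of both classes."""
    short_sum = sum(1.0 / b for b in short_bounds)
    long_sum = sum(1.0 / b for b in long_bounds)
    return short_sum / (short_sum + long_sum)


def proportion_short(n_short, n_long):
    """P: number of IRTs in the short class divided by the number of
    IRTs in both classes."""
    return n_short / (n_short + n_long)


if __name__ == "__main__":
    rr = relative_reciprocal((1.5, 2.5), (3.5, 4.5))
    p = proportion_short(68, 32)   # hypothetical counts
    print(round(rr, 3), round(p, 3))
```

For these bounds RR is about 0.68, so matching would mean that roughly two-thirds of the IRTs falling in either class fall in the shorter one.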

Hawkes and Shimp review the series of experiments 
in some detail. What emerges is the possibility of a
general function relating IRT reinforcement to be-
havior, and that the function interacts systematically
with other variables to determine response rate. As
the work has expanded to deal with more than two
IRT classes, it has come increasingly into contact with
what perhaps occurs under schedules of reinforcement
not involving explicit IRT requirements (Shimp,
1973). This way of considering IRT reinforcement
might show correspondences between what happens 
when IRTs are specifically required and when they 
are adventitiously correlated with the reinforcing 
stimulus on interval and ratio schedules. 


Sequences as Response Units 


SEQUENCES AS THEORETICAL
RESPONSE UNITS 


Skinner (1938) speculated that, in fixed-ratio per- 
formance, the effective response unit might involve all 
of the behavior extending between successive rein-
forcer presentations. Later, Mowrer and Jones (1945)
reasoned that if ratios did in fact generate units com- 
posed of all of the responses, resistance to extinction 
should be a function of the number of units and a 
constant number of ratios would occur in extinction. 
Their failure, as well as Boren’s (1961) and Weissman 
and Crossman’s (1966), to obtain a perfect correspon-
dence between the hypothesized units and number of
responses in extinction seemed to oppose a response
unit hypothesis. This failure could be due to the 
properties of extinction itself. Extinction is not a 
passive condition, but instead exerts effects of its 
own; these often include an initial intensification of 
responding prior to the well-known decrement (cf. 
Amsel, 1967; Morse, 1966). Also, extinction involves a 
change in stimulus presentations, and the way stimuli 
are presented in extinction can affect the number of 
responses emitted (Overmann & Denny, 1974). De- 
spite these complications, predictions of the response 
unit hypothesis of resistance to extinction, although 
not completely accurate, were not totally out of line. 
In the Mowrer and Jones study, for example, the num- 
ber of responses in extinction was linearly related to 
the number of responses previously required. 

Data using a different schedule support the re-
sponse unit analysis. Day and Platt (1972) used a fixed
constant number (FCN) schedule: rats had to emit a
certain number of responses, but food was presented
only if they approached the food tray after they had
made these responses. Early approaches reset the re-
sponse requirement. The schedules used were FCN 1,
FCN 8, and FCN 82. In terms of the number of re-
sponses emitted in extinction, these data replicated
those of Mowrer and Jones. However, it had been
pointed out by Denny, Wells, and Maatsch (1957)
that the number of responses in the ratio may not be
the appropriate measure of the response unit. Instead,
it would be the sequence of responses preceding each
tray approach, and the important dependent variable
in terms of resistance to extinction would be the
number of approaches. Day and Platt found that in 
extinction the groups did not differ in the number of 
such approaches, and they concluded that the response 
unit hypothesis was tenable. 

The stable sequences observed in successive com- 
ponents of second-order schedules involving fixed-ratio 
components fit a unitary response interpretation. In 
this research, a fixed-ratio sequence was treated as a 
single response unit, and it was reinforced according 
to some schedule. Findley (1962) used fixed-ratio com- 
ponents in a variety of complex schedules involving 
sequences of the same or of different responses and 
found that typical ratio patterns appeared in each 
component. Kelleher (1966) demonstrated that the 
ratio pattern was maintained when the first FR 20 
completed after 10 min—an FI 10-min (FR 20) second- 
order schedule—resulted in food presentation. Lee and 
Gollub (1971) found the same when ratio perfor- 
mance was followed by a reinforcer on a fixed-ratio 
schedule, and Kelleher, Fry, and Cook (1964) also 
observed characteristic ratio performance when the
fixed ratio was treated as a single response and was
reinforced according to a DRL schedule. Marr (1971) 
has confirmed this finding with sequence schedules in 
which distinctive stimuli were correlated with each 
successive fixed-ratio component and the ratio per- 
formance produced the reinforcer according to either 
a fixed-interval or a fixed-ratio schedule. The com- 
ponent sequences resembled single responses in other 
ways. For example, Kelleher (1966) showed that with 
the FI 10-min (FR 20) second-order schedule the time 
taken to emit successive ratios shortened as the inter- 
val progressed. This can be compared with the short- 
ening of successive interresponse times, when single 
responses are reinforced according to fixed-interval 
schedules. Shull, Guilkey, and Witty (1972) also found 
substantial similarity in the emission of successive 
fixed ratios and successive individual responses under
fixed-interval schedules. 
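
The contingency of an FI 10-min (FR 20) second-order schedule, in which each completed ratio is treated as one response, can be sketched as follows. This is a simplified reading of the arrangement described above: it assumes that the ratio count runs continuously from the start of the interval, it ignores any stimuli presented at the completion of unreinforced ratios, and the function name is hypothetical.

```python
def food_time_fi_fr(response_times, fi_s=600.0, fr=20):
    """Return the time of the response that produces food under an
    FI (FR) second-order schedule, or None if no food is earned.

    response_times are seconds since the start of the interval. Each run
    of `fr` responses is one unit; the first unit completed once `fi_s`
    seconds have elapsed produces food."""
    count = 0
    for time in response_times:
        count += 1
        if count == fr:
            count = 0              # one ratio unit completed
            if time >= fi_s:
                return time        # first unit completed after the interval
    return None


if __name__ == "__main__":
    # Hypothetical steady responding, one response every 7 seconds.
    steady = [7.0 * i for i in range(1, 201)]
    print(food_time_fi_fr(steady))
```

Treating the completed ratio as the unit means that the interval contingency is evaluated only at unit boundaries, just as it is evaluated only at single responses on a simple fixed-interval schedule.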


SEQUENCES AS CONDITIONABLE 
RESPONSE UNITS 


Are sequences reinforceable units—that is, is their 
probability of occurrence a function of their conse-
quences? To determine that and thereby to determine
the plausibility of sequences as theoretical response
units, it is necessary to impose the sequence as a
formal unit and to observe whether it is conditionable.
It is well to be cautious in concluding that a sequence
is actually required by a schedule. For example, a
ratio schedule imposes a sequence of n responses as a
formal requirement, and it demonstrates that such a 
sequence is conditionable (reinforceable). It does not 
show that the particular pattern according to which 
ratios are emitted is conditionable, because the sched- 
ule does not require any particular pattern. Interval 
schedules specify nothing at all about sequences, since 
they require only a single response. Many responses 
may occur in highly organized form, but interval 
schedules do not demonstrate that either total interval 
performance or any portion of it other than the single 
response is a conditionable response unit. 

Findley’s (1962) and Kelleher, Fry, and Cook's 
(1964) procedures demonstrate the necessary condi- 
tions for observing that an aspect of schedule per- 
formance is conditionable. In their experiments, a 
reinforcer was presented at the completion of a fixed 
ratio only if the postreinforcement pause exceeded a 
minimum duration. They found that the pause dura- 
tions changed in accordance with the requirements, 
so that pause behaved as a conditionable unit. These 
findings justify interpretation of the pause as a re-
sponse unit on ordinary fixed-ratio schedules.

Fig. 13. Mean ratio time in seconds (initial pause time plus
time spent responding) emitted by a pigeon as a function of 
the ratio time required for food presentation under an FR 30 
schedule. (From DeCasper & Zeiler, 1974; Zeiler, 1970, 1972.) 


This differential reinforcement procedure has been 
extended to other aspects of fixed-ratio performance. 
Figure 13 shows the effects of requiring that the
duration of an entire ratio (i.e., the time from the first
opportunity to respond until the completion of the
ratio) must exceed some particular value. Other ex-
periments investigated the effects of requiring that
the duration be shorter than a specified value (Zeiler, 
1970, 1972b). In all of the experiments, emitted 
durations were related to required durations by the 
same power function applicable to other temporal 
differentiation procedures (see page 224) (DeCasper 
& Zeiler, 1974). The duration of an entire ratio, 
therefore, is a reinforceable aspect of behavior. Since 
the total time taken to execute a ratio—the initial 
pause time plus the time spent responding—belongs 
not to any single response, but instead describes 
a property of the entire sequence, the experiments 
demonstrated that the ratio sequence can be shown 
to have unitary properties. More precisely, they 
showed that duration is a reinforceable aspect of the 
sequence. In one experiment (Zeiler, 1972b) the most 
orderly aspect of behavior was shown to be the over- 
all duration of the ratios: the initial pause times and 
the rate of responding after the pause did not show 
equally predictable effects. The total sequence was 
more orderly than were either of the two major com-
ponents.

Since initial pause duration and overall duration
of fixed-ratio sequences are conditionable, they could 
be reinforced under ordinary fixed-ratio schedules. If 
so, they could play a role in producing the stereotypy 
in pausing and interreinforcer time often observed in 
fixed-ratio performance. They cannot, however, be 
important in the development of steady-state behav-
ior. If they were, the pause and ratio durations ini-
tially correlated with the reinforcer should tend to
predominate. This does not usually occur. Consider a
pigeon changed from one fixed-ratio schedule to a 
larger one. Immediately following the transition the 
pause is short and the ratio duration is also short. 
Eventually both durations lengthen despite these early 
duration-reinforcer relations. Apparently, other vari- 
ables operate to establish the pause and ratio dura- 
tions, and then they may come to operate as indirect 
determinants of performance once they have become 
more or less stable. 

Fig. 14. Probability distribution of initial pause lengths (in
seconds) in the first 25 sessions of an FR 30 schedule after an FR
1 history with pigeons. Each session consisted of 30 ratios.

Fig. 15. Probability distribution of run-time lengths (seconds
from the first to the last response) in the same sessions as in
Figure 14.

Figure 14 shows the frequency distributions of the
duration of the pauses prior to the first response in
the first 25 sessions of FR 30 after FR 1 training. Over
the course of these sessions, it appeared that those
pause durations that occurred more frequently (and thus
those that were most often followed 30 responses later
by the reinforcing stimulus) became still more pre-
dominant. This could be interpreted as indicating the
differential reinforcement of particular pause dura-
tions and the consequent increase in probability. Simi-
lar data were obtained for overall ratio duration. A
very different effect is revealed in Figure 15, which 
shows the distribution of run times (the time from the 
1st to the 30th response) for the same 25 sessions. The
shorter run times came to predominate, even though 
they were not the times correlated with reinforcer 
presentation in the earlier sessions. It is not likely, 
therefore, that a specific run time is being reinforced 
in the development of simple fixed-ratio performance. 
In conclusion, it is not clear whether the reinforce- 
ment of unitary properties operates to determine be- 
havior under fixed-ratio schedules. Perhaps pause time 
and ratio duration are directly reinforced and run
time is not. Or perhaps none of these aspects of per-
formance is reinforced directly, but all develop their
characteristics for other reasons. It may be, as Dews 
(1970) suggested, that “when a subject is exposed re- 
peatedly to a consistent schedule, patterns of respond- 
ing may become sufficiently consistent to enable par- 
ticular aspects of the patterns themselves to be related 
reliably to the schedule. The very reliability of the 
relation may lead to the further strengthening of 
those particular aspects of the pattern” (p. 59). Per- 
haps unitary aspects of fixed-ratio performance do not 
play a role in establishing the final behavior, but do 
operate to maintain steady-state performance once it 
develops.


SUMMARY AND CONCLUDING REMARKS


The Controlling Variables 


The preceding account of reinforcement schedules 
differs from those of Ferster and Skinner (1957) and 
Morse (1966) only in the variables that are given ma-
jor emphasis. The general approach is basically the
same: the assumption is that schedule-controlled be- 
havior is multiply determined and that each hypothe- 
sized controlling variable must be studied directly to 
determine if it operates as expected. 

Response rate and patterning appear to be con- 
trolled by different variables. This view differs some- 
what from Ferster and Skinner's and Morse’s in that 
they tended to deal with both characteristics simul-
taneously. A major departure is the present emphasis
on factors other than the quantitative properties of
the response occurring at the moment of reinforcer
presentation.

There are two types of variables responsible for
performance: the direct variables specified by the
schedule and the indirect variables that derive from
performance. The variables have two types of effect,
stereotypic and dynamic, differentiated by whether
they tend to maintain the same behavior or to change 
performance. With respect to response rate, one sig- 
nificant factor is the particular response dependency 
that guarantees close temporal contiguity between a 
particular response class and the reinforcing stimulus. 
This variable operates directly under response-depen- 
dent schedules and indirectly under response-indepen- 
dent schedules, and it has the effect of maintaining or 
increasing the probability of the response that pre- 
cedes the reinforcer. This is the most molecular as- 
pect of the present approach. Other hypothesized 
variables encompass larger time periods and larger 
groups of responses. The number of responses per
reinforcer and interreinforcer time were considered to
be major factors. Also, the availability of the rein- 
forcer when responding weakens may determine 
whether a given schedule will maintain many re- 
sponses per reinforcer presentation. Thus responding 
under a given schedule reflects the conjunction of the 
response dependency, number of responses per rein- 
forcer, interreinforcer time, and the regenerating 
power of the schedule. This account was able to deal 
with interval and ratio schedules and some complex 
schedules, but other complex schedules may involve 
additional variables. 

Patterning was explained by a single variable, the 
placement of the reinforcer in time. In interval sched- 
ules the tendency to respond at any instant is deter-
mined by the location in time relative to the overall in-
terval duration.

Schedules that specify certain events as prerequi- 
site for reinforcer presentation (e.g., a certain inter- 
response time or a certain behavioral sequence) reveal 
that such events can exert control over responding. 
Their role in the absence of such specification is un- 
clear. 


Some Reflections on Methodology 


The rationale for the methodology advocated here 
is that the effects of hypothesized variables must be 
assessed by studying those variables directly. This 
approach has been followed consistently. Thus when 
number of responses per reinforcer was proposed as a 
source of control in fixed-interval performance, its 
mode of operation was assessed by reference to fixed- 
ratio schedules, which control it directly. The same
procedure was followed with every hypothesized con- 
trolling variable. The approach has its limitations 
and potential pitfalls. Inherent in it is the idea that 
variables have relatively simple effects and that com- 
plicated interactions do not occur. Consider, for ex- 
ample, the assessment of number of responses per 
reinforcer by the use of ratio schedules. As the num- 
ber (ratio size) is manipulated, interreinforcer time 
changes as well. Interreinforcer time is analyzed di-
rectly in interval schedules, but here responses per
reinforcer will vary. If, in fact, the two variables in- 
teract in some complicated way, the function of either 
cannot be easily ascertained. Other as yet unknown 
experimental designs will be necessary to evaluate 
such interactions. 
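
The confound can be made concrete with a little arithmetic. The sketch below is deliberately oversimplified: it assumes a constant response rate with no pausing, the rates are hypothetical, and the function names are introduced only for illustration.

```python
def interreinforcer_time_on_ratio(ratio_size, responses_per_s):
    """Seconds between reinforcers on a ratio schedule when responding
    proceeds at a constant rate."""
    return ratio_size / responses_per_s


def responses_per_reinforcer_on_interval(interval_s, responses_per_s):
    """Responses emitted per reinforcer on an interval schedule when
    responding proceeds at a constant rate."""
    return interval_s * responses_per_s


if __name__ == "__main__":
    # Doubling the ratio doubles interreinforcer time at a fixed rate.
    print(interreinforcer_time_on_ratio(30, 2.0),
          interreinforcer_time_on_ratio(60, 2.0))
    # On an interval schedule, responses per reinforcer follow the rate.
    print(responses_per_reinforcer_on_interval(60.0, 0.5),
          responses_per_reinforcer_on_interval(60.0, 1.0))
```

Even in this simple case neither schedule varies one quantity while holding the other fixed, which is the difficulty the paragraph describes.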

Another approach to the analysis of controlling 
variables is subtractive. A variable is prevented from 
operating (cf. Dews’s work on interrupting the hy-
pothesized response chain in fixed-interval schedules),
and if the behavior is not disrupted, it is concluded
that the variable is not important. This approach also 
requires caution. If a performance is actually over- 
determined—i.e., if the behavior under a given 
schedule arises because several variables operate inde- 
pendently to produce the same performance—the elimi- 
nation of one variable may not have a noticeable 
effect. Yet the variable may actually play a role. There 
are enough precedents (e.g., the sensory basis of maze 
learning, recovery of function in the brain) to warrant 
consideration of this possibility. 


Schedules As Fundamental 


Morse and Kelleher (1970) have maintained that 
schedules are fundamental determinants of behav- 
ior, because the particular events scheduled may be 
less important than is the schedule itself. In this sense, 
it is the schedule in relation to ongoing behavior that 
is fundamental in determining subsequent perfor- 
mance. There is another sense in which schedules are 
fundamental. If each schedule represents a particular 
conjunction of variables, the only way of arranging 
that conjunction is by establishing that schedule. 
Fixed-interval schedules establish (1) a response- 
reinforcer dependency involving a single response; (2) 
a nearly fixed interreinforcer time; (3) an unlimited 
range of possible number of responses per reinforcer 
in successive intervals; (4) a reinforcer following a
single response when responding is not well main-
tained; (5) presentation of the reinforcer at a fixed
time since the beginning of the interval. Only the
fixed-interval schedule specifies that all of these condi-
tions will occur. Each variable may be analyzed, and
fixed-interval performance may be understood as the
outcome of these component events, but the schedule
is fundamental in that it alone can arrange these pre-
cise interactions. To the extent that the precise inter-
action of multiple variables is responsible for a dis-
tinctive performance, each schedule is a fundamental 
arrangement. 

The problems faced by the experimenter interested 
in understanding schedule performance are difficult 
because of the complexities of the relationships. How- 
ever, the fundamental role of schedules in psychology 
demands the attempt. It is impossible to study be- 
havior either in or outside the laboratory without en- 
countering a schedule of reinforcement: whenever be- 
havior is maintained by a reinforcing stimulus, some 
schedule is in effect and is exerting its characteristic 
influences. Only when there is a clear understanding 
of how schedules operate will it be possible to under-
stand the effects of reinforcing stimuli on behavior.


Choice in Concurrent Schedules and
a Quantitative Formulation
of the Law of Effect


INTRODUCTION 


Since the early 1960s there has been much operant 
conditioning research using continuous choice pro- 
cedures. Several factors suggest the utility of these 
procedures for quantifying the effects of reward and 
punishment on behavior, the law of effect. 

First, in continuous choice procedures (called con- 
current schedules) two or more alternative schedules 
of reinforcement are simultaneously available and the 
animal continually chooses between responding to 
one alternative or the other. Thus the number of re- 
sponses or amount of time the animal allocates to each 
alternative during an experimental session may be 
considered a measure of its preference. In this way 
choice can be used to quantify the relative reward 
value of different conditions of reinforcement. 

James Wilkinson. I acknowledge my indebtedness to work by 
Peter Killeen for many of the points raised in the discussion, 
and to several graduate students in the Harvard University 
operant conditioning laboratory for permission to quote un- 

Second, rates of responding to each of two alterna- 
tives are far more sensitive to the frequency and mag- 
nitude of reinforcement for each alternative than are 
response rates in a single response situation (Catania, 
1963a; Herrnstein, 1961). 

The most persuasive argument for any measure of 
response strength is an orderly relation between that 
measure and the frequency, duration, or immediacy 
of reinforcement. Relative performance and reinforce- 
ment measures obtained from concurrent schedules 
show just such an orderly relation. Many studies have 
shown that a simple linear “matching” relation holds 
between relative response rates (or time distribution) 
and relative frequency (Herrnstein, 1961), magnitude 
(Brownstein, 1971; Catania, 1963a) and immediacy of 
reinforcement (Chung & Herrnstein, 1967). This rela- 
tion is described by the following equations: 

$$\frac{R_1}{R_1 + R_2} = \frac{T_1}{T_1 + T_2} = \frac{r_1}{r_1 + r_2} = \frac{i_1}{i_1 + i_2} = \frac{a_1}{a_1 + a_2} \tag{1}$$

where $R_1$ and $R_2$ are the number of responses per ses- 
sion to each of the two alternatives, $T_1$ and $T_2$ are 
the times spent responding on each schedule, and $r_1$ 
and $r_2$ are the frequencies of reinforcement for the 
alternative responses; $i_1$ and $i_2$ represent different im- 
mediacies (the reciprocal of the delay of reinforce- 
ment), and $a_1$ and $a_2$ different amounts of reinforce- 
ment for the two alternatives. From the matching 
relation, Herrnstein (1970) has formulated a powerful 
set of equations that describe the relation between 
response strength and reinforcement parameters in 
single-response situations as well as concurrent sched- 
ules. 

This chapter is largely concerned with an evalua- 
tion of Herrnstein’s equations as a quantification of 
the law of effect. After briefly describing concurrent 
schedule procedures, the chapter assesses the empirical 
basis and generality of the matching relation and its 
status as a general principle. Certain prerequisite 
conditions for matching are discussed, together with 
the role of procedural factors or more molecular proc- 
esses. Herrnstein’s (1970) quantitative formulation is 
then considered in detail—in particular, its ability to 
incorporate results of earlier runway studies in the 
tradition of Hull and Spence, and its relation to be- 
havioral contrast phenomena. Finally, alternative or 
more general quantitative models of choice in con- 
current schedules are discussed. 


CONCURRENT SCHEDULES 


Two different methods of programming concurrent 
schedules have generally been used. In one of these 
(Herrnstein, 1961), the animal switches back and forth 
between two spatially separated response keys or lev- 
ers, each associated with a different reinforcement 
schedule. In the second (Findley, 1958), the animal 
switches between two schedules programmed on the 
same key by responding on a second changeover (CO) 
key; each schedule is correlated with a different stim- 
ulus. The first method will be referred to as a two- 
key or two-lever concurrent, the second as a CO-key 
concurrent schedule. The difference between the two 
procedures is illustrated in Figure 1. 

In both procedures the two component schedules 
are programmed independently and continuously. If 
a reinforcement opportunity is programmed by a 
variable-interval (VI) schedule for response A while 
the animal is making response B, that reinforcement 
is held until the animal again makes response A. 
Consequently, the probability of reinforcement from 
one schedule increases with the time spent responding 
on the other schedule. 
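
The way a held reinforcer makes the unattended schedule more attractive over time can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that the VI schedule sets up reinforcers at exponentially distributed intervals and that nothing was being held when the animal switched away; neither assumption comes from the studies cited, and the function name is hypothetical.

```python
import math


def prob_reinforcer_waiting(mean_interval_s: float, time_away_s: float) -> float:
    """Probability that a VI schedule is holding a reinforcer after the animal
    has spent `time_away_s` seconds on the other alternative.

    Assumption (not from the text): intervals are exponentially distributed
    with mean `mean_interval_s`, and no reinforcer was held at the switch.
    """
    if mean_interval_s <= 0 or time_away_s < 0:
        raise ValueError("mean interval must be positive and time away non-negative")
    return 1.0 - math.exp(-time_away_s / mean_interval_s)


if __name__ == "__main__":
    # Hypothetical VI 1-min schedule: the longer the absence, the more likely
    # that the first response after returning will be reinforced.
    for t in (1, 5, 20, 60):
        print(t, round(prob_reinforcer_waiting(60.0, t), 3))
```

Under these assumptions the probability rises monotonically with time away, which is the property the text relies on.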


[Figure 1. Key labels in the drawing: SCHEDULE CONTROL, SCHEDULE 1 OR 2, SCHEDULE 1, SCHEDULE 2.] 

Fig. 1. Representation of two methods of programming con- 
current schedules. The pigeon on the left is responding on a 
changeover-key concurrent schedule (Findley, 1958), the pigeon 
on the right on a two-key concurrent schedule (Herrnstein, 
1961). 


When equal concurrent interval schedules are pro- 
grammed in this way, rapid alternation between the 
two schedules is the dominant response pattern 
(Herrnstein, 1961; Skinner, 1950). This is conducive to 
the development of concurrent superstitions (Catania, 
1966), the adventitious correlation of one response with 
reinforcement programmed for another. Responses to 
one alternative thus come partially under the control 
of the reinforcement schedule associated with the 
other. The best illustration of this comes from a study 
of one pigeon by Catania and Cutts (1963). Pecking at 
two keys was maintained by a concurrent VI 1-min VI 
2-min schedule. Then the reinforcement for the VI 
2-min key was discontinued (extinction). The pigeon 
continued pecking at a steady though somewhat re- 
duced rate (about 15 responses per min) on the ex- 
tinction key throughout the 12 1-hr sessions. Catania 
and Cutts reported similar results from 13 human sub- 
jects pressing two buttons on a concurrent VI 30-sec 
EXT schedule. Most of the subjects responded at a 
substantial rate on the button associated with extinc- 
tion. Although never explicitly reinforced, responses 
on that button often occurred in close temporal prox- 
imity to reinforcement (increments of a counter) for 
the other button. Several early studies of concurrent 
schedules also found control of one response by the 
schedule for another (e.g., Sidman, 1958). Ferster and 
Skinner (1957, Chapter 13), for example, programmed 
a concurrent VI FI (fixed-interval) schedule, each 
schedule associated with a different key. On a single 
FI schedule almost no responding occurs early in the 
interval, but in the concurrent schedule considerable 
responding was maintained on the FI key early in the 
fixed interval, probably by accidental correlation with 
VI reinforcements for pecks on the other key. 

To separate the two schedules a changeover delay 
(COD) is usually added to the concurrent schedule 
procedure (Herrnstein, 1961). The COD specifies the 
minimum time interval that must elapse between a 
CO and a subsequent reinforced response. In a two- 
key (or two-lever) concurrent procedure the COD is 
usually timed from the first response on a given key 
(or lever) after a CO (Herrnstein, 1961); in CO-key 
procedures it is usually timed from responses on the 
CO key (Catania, 1966). Alternatively, the COD can 
be programmed from the last response on a key before 
a CO (Findley, 1958), but since this selectively rein- 
forces slow COs, the two other methods are preferred. 
Subsequent mention of the COD will refer to the first 
two methods of scheduling, unless otherwise specified. 
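
The two-key timing rule described above can be sketched in a few lines. This is a minimal illustration rather than any laboratory's actual control program: the class and method names are hypothetical, and it additionally assumes that the first response of a session starts a COD like any other changeover.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CodTimer:
    """Changeover delay timed from the first response on a key after a changeover."""
    cod_s: float
    current_key: Optional[str] = None
    cod_start_s: float = 0.0

    def response_eligible(self, key: str, t_s: float) -> bool:
        """Register a response on `key` at time `t_s` (seconds) and report
        whether a scheduled reinforcer could be delivered for it."""
        if key != self.current_key:
            # First response after a changeover: start timing the COD.
            self.current_key = key
            self.cod_start_s = t_s
        return (t_s - self.cod_start_s) >= self.cod_s


if __name__ == "__main__":
    timer = CodTimer(cod_s=1.5)
    responses = [("A", 0.0), ("A", 1.0), ("A", 2.0), ("B", 2.5), ("B", 3.0), ("B", 4.2)]
    print([timer.response_eligible(key, t) for key, t in responses])
    # prints [False, False, True, False, False, True]
```

Only responses made at least 1.5 sec after the first post-changeover response are eligible, so a reinforcer for B can never follow a response on A by less than the COD.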

The COD therefore insures a separation in time 
between response A and the reinforcement of response 
B, preventing the adventitious reinforcement of AB 
sequences. Its effectiveness in this respect is illustrated 
by a condition of the Catania and Cutts (1963) pigeon 
experiment in which a 1-sec COD was introduced. 
When the schedule for one key changed to extinction, 
responding on the extinction key rapidly declined to 
zero. Without the COD, responding on the extinction 
key continued at a steady rate. In the Catania and 
Cutts human experiment the introduction of a COD 
of between 2 and 15 sec in duration substantially re- 
duced and in many cases eliminated responding on 
the extinction button. 

The dominant pattern of responding on equal 
concurrent VI schedules with a COD shows runs of 
responses of roughly COD duration on one schedule 
alternating with runs of responses of similar duration 
on the other schedule. When unequal VI schedules 
are programmed concurrently, the duration of re- 
sponse runs maintained by the schedule programming 
more frequent reinforcement increases, and COs are 
less frequent (Catania, 1966). In the absence of the 
COD the switching response itself is a primary com- 
ponent of the behavior, and the main effect of changes 
in reinforcement is on switching; with a COD the 
main effect is on the rate of responding to each al- 
ternative. 


Measurement of Choice or Preference 


Response rates in concurrent schedules are usually 
calculated in terms of the number of responses made 
on each schedule divided by the total session time 
(minus time consumed by reinforcement)—i.e., over- 
all response rates. Rate of responding to each alterna- 
tive is calculated with respect to overall session time 
rather than the time the animal actually spends re- 
sponding on each key because the concurrent sched- 
ules run continuously and the response alternatives 
are simultaneously available to the animal through- 
out the experimental session. Calculation of response 
rates with respect to the time that a given stimulus is 
in effect is more customary for multiple schedules, 
where each schedule runs and is available to the an- 
imal only in the presence of a particular stimulus. 
However, particularly in the CO-key concurrent sched- 
ule, the amount of time that the animal spends in the 
presence of the stimuli associated with each schedule 
can also readily be calculated. This enables the calcu- 
lation of local response rates—i.e., the number of re- 
sponses made on each schedule divided by the time 
spent responding on that schedule. 

The relation between response rates and reinforce- 
ment frequencies in concurrent schedules is usually 
considered in terms of relative measures (Herrnstein, 
1970): relative overall response rate [the number of 
responses made to one alternative divided by the 
total number of responses during the session—i.e., 
$R_1/(R_1 + R_2)$] and relative reinforcement frequency 
[$r_1/(r_1 + r_2)$]. Similarly, relative time distribution is 
measured in terms of the time spent in the presence 
of the stimulus associated with one schedule divided 
by the total session time [$T_1/(T_1 + T_2)$]. In two-key 
concurrent schedules, the time distribution is calcu- 
lated in terms of cumulative interchangeover time for 
each alternative. Exclusive preference for one alterna- 
tive is shown by a relative response rate or relative 
time distribution of 1.00 or .00, indifference between 
the alternatives by relative values of .50. The relation 
between ratios of responses ($R_1/R_2$) or of time ($T_1/T_2$) 
and ratios of obtained number of reinforcements 
($r_1/r_2$) for each of the alternatives has also frequently 
been examined (e.g., Baum, 1974a; Baum & Rachlin, 
1969; Staddon, 1968). 
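
The measures defined in this section can be collected in one small function. The sketch below is illustrative only: the function name and the session totals are hypothetical, and it assumes that session time (excluding reinforcement) equals the sum of the times spent on the two schedules.

```python
from typing import Dict


def choice_measures(resp_1: int, resp_2: int,
                    time_1_s: float, time_2_s: float,
                    reinf_1: int, reinf_2: int) -> Dict[str, float]:
    """Overall rates, local rates, relative measures, and ratios for a
    two-alternative concurrent schedule (rates in responses per minute)."""
    session_s = time_1_s + time_2_s  # assumed: no time outside the two schedules
    return {
        "overall_rate_1": 60.0 * resp_1 / session_s,
        "overall_rate_2": 60.0 * resp_2 / session_s,
        "local_rate_1": 60.0 * resp_1 / time_1_s,
        "local_rate_2": 60.0 * resp_2 / time_2_s,
        "relative_response_rate": resp_1 / (resp_1 + resp_2),
        "relative_time": time_1_s / session_s,
        "relative_reinforcement": reinf_1 / (reinf_1 + reinf_2),
        "response_ratio": resp_1 / resp_2,
        "time_ratio": time_1_s / time_2_s,
        "reinforcement_ratio": reinf_1 / reinf_2,
    }


if __name__ == "__main__":
    # Hypothetical session: 3000 and 1000 responses, 2700 and 900 sec, 30 and 10 reinforcers.
    for name, value in choice_measures(3000, 1000, 2700.0, 900.0, 30, 10).items():
        print(f"{name}: {value:.3f}")
```

In this made-up session the local rates are equal on the two schedules, so the relative response rate (.75) follows directly from the relative time distribution (.75).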


THE MATCHING RELATION IN 
CONCURRENT VI SCHEDULES— 
REINFORCEMENT FREQUENCY 


In 1961 Herrnstein first demonstrated that when 
two independent VI schedules arranged reinforce- 
ments for concurrent responses and a COD was in 
effect, there was a matching relation between relative 
overall response rates and relative reinforcement fre- 
quency. Throughout his experiment the two VI sched- 
ules set an overall maximum rate of reinforcement at 
40 per hr, but the number of reinforcements allocated 
to each key was systematically varied. At all distribu- 
tions of the reinforcements, Herrnstein found that the 
pigeons’ relative response rates approximately equalled 
the relative reinforcement frequencies for the two 
alternatives: 

$$\frac{R_1}{R_1 + R_2} = \frac{r_1}{r_1 + r_2} \tag{2}$$

where R is response rate, r is reinforcement frequency, 
and the subscripts denote the two alternatives. The 
data from the three pigeons in the experiment are 
shown in Figure 2. 

Fig. 2. The relative frequency of responding to one alterna- 
tive in a two-key concurrent VI VI schedule as a function of 
the relative frequency of reinforcement for that alternative. The 
diagonal line shows matching between the relative frequencies. 
(Data from three pigeons, Herrnstein, 1961.) 


Response Matching 


Since Herrnstein’s 1961 study, matching between 
relative response rate and relative reinforcement fre- 
quency has been demonstrated in both kinds of con- 
current schedule procedure and for several different 
species. 

McSweeney (1975) reinforced pigeons’ treadle 
presses on several two-treadle concurrent VI VI sched- 
ules with a 2-sec COD. The rate of reinforcement for 
one alternative remained constant at 30 food presen- 
tations per hr; the rate of reinforcement for the other 
varied from 15 to 120 per hr. For each of four pigeons, 
relative rate of responding in each component sched- 
ule matched the relative frequency of food that the 
schedule provided. 

In an experiment by Baum (1972), a pigeon lived 
in the experimental situation. All of its food was 
obtained by pecking at two keys, each associated with 
a separate VI schedule. A 1.8-sec COD was in effect 
throughout the experiment. The bird was free to eat 
to satiation and one alternative alone was often suffi- 
cient to fulfill the bird’s normal food requirements. 
Distribution of responses between the alternatives was 
therefore often unnecessary, but the pigeon neverthe- 
less made thousands of responses on each key each 
day. For a wide range of relative reinforcement fre- 
quencies the proportion of pecks allocated to either 
key equaled the proportion of food obtained by pecks 
at that key. 

In a subsequent experiment, Baum (1974b) ex- 
tended the matching relation to wild pigeons in a 
more natural habitat. A version of the standard op- 
erant conditioning apparatus was placed in the attic 
of a wooden frame house in Cambridge, Massachu- 
setts. A flock of about 20 free-ranging wild pigeons 
that inhabit the attic were trained to peck at two 
keys for access to grain. A narrow perch in front of 
the keys allowed only one pigeon at a time access to 
the keys and food, but the pecks of the group were 
treated as an aggregate. Over a wide range of concur- 
rent VI VI schedules without a COD the pigeons’ pro- 
portion of pecks at a key approximately equaled the 
proportion of grain presentations obtained from it. 

Nevin (1969) used a two-key concurrent procedure, 
but presented the two keys simultaneously in discrete- 
choice trials. Independent concurrent VI schedules 
which ran during the intertrial interval as well as 
during the choice trials arranged reinforcements. 
When a reinforcer was scheduled for a key it was held 
and timing of the intervals in that schedule stopped 
until the next choice trial in which that key was 
chosen. The first peck in a trial terminated the trial: 
if a reinforcer had been scheduled for that key the 
response produced 4-sec access to grain. Choice trials 
without a peck lasted for 2 sec. The intertrial interval 
(ITI) was 6 sec long, during which time the keys 
were dark. Pecks during the ITI extended it for 6 
sec from the last peck. The proportion of responses 
(choices) made by the pigeons to each key closely 
matched the proportion of reinforcements produced 
by each key in this discrete-trials procedure, as it does 
in concurrent VI schedules with continuous access to 
the keys. 

Schroeder and Holland (1969) reported a further, 
perhaps more exotic, confirmation of the response- 
matching relation with human subjects. Their sub- 
jects were required to monitor deflections of pointers 
on four dials arranged in a square with two on the 
left and two on the right. An eye movement camera 
recorded macrosaccadic eye movements scanning each 
of the dials. A fixation on a dial after looking away 
toward the other dials counted as one response. Look- 
ing horizontally or diagonally between the two pairs 
of dials was defined as a changeover. A change in fixa- 
tion between two vertically arranged dials was there- 
fore a response but not a changeover, while a change 
in fixation between left- and right-hand dials was 
both. Pointer deflections were delivered to the two 
left-hand dials on one variable-time (VT) schedule 
and to the right-hand dials on a second, independent 
VT schedule. Scheduled deflections were assigned 
with an equal probability to the upper or lower dial 
on each side. Signal presentation was contingent on 
looking toward the side for which it was scheduled, 
but was independent of which of the two dials on 
that side was being fixated.1 When a short COD was 
programmed between crossover eye movements and 
signals, the pattern of scanning changed from fixating 
the four dials in succession or in a Z-shaped pattern to 
vertical scanning of the dials on either side with 
fewer crossovers. All reinforcements scheduled to 
occur before the COD timed out were held until the 
end of the COD and then delivered. With a 2.5-sec 
COD for one subject and 1.0-sec CODs for the others, 
all six subjects in the experiment matched relative 
scanning eye movement rates (number of fixations 
per min) on each side to the relative signal frequen- 
cies on each schedule. 


Matching of Both Responses and Time 


Catania (1963b) used a CO-key procedure so that 
time spent in each component could be accurately 
measured. With a 2-sec COD in effect, he found that 
pigeons approximately matched both the relative re- 
sponse rates and the relative amount of time spent in 
each component to the relative frequency of food. 
Similar results for pigeons were reported by Silberberg 
and Fantino (1970) using a two-key procedure, but 
with CODs varying between .88 and 3.5 sec. With 
rats as subjects and brain stimulation as the rein- 
forcer, Shull and Pliskoff (1967) also found that relative 
rate of lever pressing and time distribution matched 
the obtained distribution of reinforcements, but only 
when the COD was greater than 7.5 sec. 

1 Reinforcement presentation was thus independent of the 
vertical scanning eye movements, but the schedule was not 
strictly a VT schedule since the reinforcements were contingent 
on looking toward the appropriate side. 

A different method for programming a CO-key con- 
current schedule was used by Stubbs and Pliskoff 
(1969). One VI programmer arranged the reinforce- 
ments for both the schedules, each schedule being 
associated with a different color stimulus on the main 
response key. When a reinforcement was programmed, 
it was allocated to one schedule or the other according 
to different probabilities. Thus whenever a reinforce- 
ment opportunity was arranged for one schedule, no 
further reinforcements could be arranged for either 
schedule until the available reinforcement had been 
obtained. This procedure forces the subject to respond 
occasionally on both alternatives in order to obtain 
reinforcement for either alternative, even if one of the 
schedules is extremely unfavorable when compared 
with the other. The advantage of the Stubbs and 
Pliskoff procedure lies in insuring that the obtained 
relative reinforcement frequency must equal the 
scheduled relative frequency. On the other hand, the 
two alternative schedules are no longer independent 
of one another. 
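
The logic of the single-programmer procedure can be sketched as a small simulation. Everything below is an illustrative assumption rather than a description of Stubbs and Pliskoff's apparatus: exponential intervals, one response per second, a simulated subject that alternates fixed stays on the two schedules, and hypothetical names throughout.

```python
import random
from typing import List, Tuple


def simulate_single_timer(p_first: float, mean_interval_s: float,
                          dwell_s: Tuple[int, int], session_s: int,
                          seed: int = 0) -> List[int]:
    """Count reinforcers obtained on each schedule when one VI timer arranges
    all reinforcers and assigns each to schedule 0 with probability `p_first`.

    `dwell_s` gives the (positive, whole-second) stay on each schedule.
    """
    rng = random.Random(seed)
    obtained = [0, 0]
    pending = None  # schedule currently holding a reinforcer, if any
    clock = 0.0     # the timer runs only while nothing is pending
    next_setup = rng.expovariate(1.0 / mean_interval_s)
    side, left_in_dwell = 0, dwell_s[0]
    for _ in range(session_s):
        if pending is None:
            clock += 1.0
            if clock >= next_setup:
                pending = 0 if rng.random() < p_first else 1
        if pending == side:  # one response per second on the current side
            obtained[side] += 1
            pending, clock = None, 0.0
            next_setup = rng.expovariate(1.0 / mean_interval_s)
        left_in_dwell -= 1
        if left_in_dwell == 0:
            side = 1 - side
            left_in_dwell = dwell_s[side]
    return obtained


if __name__ == "__main__":
    # Opposite time allocations, same scheduled relative frequency (.75).
    for dwell in ((20, 5), (5, 20)):
        got = simulate_single_timer(0.75, 30.0, dwell, 200_000, seed=1)
        print(dwell, got, round(got[0] / sum(got), 3))
```

Because the timer halts until the pending reinforcer is collected, the obtained relative frequency stays near the scheduled value however the simulated subject divides its time, which is the advantage the text attributes to the procedure; what changes with time allocation is the overall number of reinforcers obtained.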

Stubbs and Pliskoff found matching for both rela- 
tive response rates and relative time for three pigeons 
at five different relative reinforcement frequencies. In- 
creasing the COD from 2 sec, the value at which 
matching was first obtained, through 32 sec produced 
little systematic change in relative response rates when 
relative reinforcement frequency was kept constant at 
.75. In this procedure, responding on the two sched- 
ules is somewhat constrained since the subject has to 
respond on the less favorable schedule in order to 
maintain the overall frequency of reinforcement. 
However, this requirement by no means forces the 
subject toward matching, and might be expected to 
favor indifference. Nevertheless, excellent matching 
was obtained. 

In contrast to these results, Schmitt (1974) failed to 
find matching with humans in a conventional CO-key 
concurrent schedule procedure. Five subjects pressed 
a button for increments of a counter, each worth a 
number of cents, on one of two independent VI sched- 
ules each associated with a different stimulus light. 
They could change schedules by operating a toggle 
switch. A 1.5-sec COD stipulated the time that had to 
elapse after a changeover before a button press could 
be reinforced. In almost all of the experimental condi- 
tions relative response rates and relative time distribu- 
tion did not match the relative frequency of reinforce- 
ment, the departure usually being toward indifference. 
Apart from the difference in subjects, reasons for the 
discrepancy between these results and the studies on 
nonhuman species are not readily apparent. 


Time Matching 


In most of these studies, local response rates were 
found to be the same for each schedule, and response 
matching resulted from the matching of relative time 
distribution to relative reinforcement frequency. 
Brownstein and Pliskoff (1968) and Baum and Rach- 
lin (1969) therefore argued that time and not response 
allocation underlies the matching relation. 

Brownstein and Pliskoff (1968) demonstrated that 
the relative time spent in either component of a CO- 
key concurrent schedule matched the relative fre- 
quency of food provided in that component even in 
the absence of any key pecking for the food. The 
pigeons in their experiment changed the color of a 
stimulus light by pecking a single CO key, but the 
reinforcements in each component were delivered 
independently of the birds’ behavior according to 
different VT schedules. The pigeons therefore chose 
between two different frequencies of food delivery 
each correlated with a different stimulus color. In this 
situation they matched the proportion of the session 
time spent in the presence of a color to the proportion 
of reinforcements associated with that stimulus: 

$$\frac{T_1}{T_1 + T_2} = \frac{r_1}{r_1 + r_2} \tag{3}$$

Baum and Rachlin (1969) pointed out that pigeons 
tend to peck at a constant rate when they are respond- 
ing, with the majority of interresponse times (IRTs) 
falling between .3 and .5 sec (Blough, 1963). Long- 
term response rate varies with the duration of pauses 
between bursts of responses at the constant rate. With 
such a constant rate of responding, time spent re- 
sponding determines the number of responses. Hence, 
Baum and Rachlin argued, time spent responding is 
the most general measure of response frequency for 
repetitive responses like key pecking or lever pressing. 

Baum and Rachlin studied a response that can only 
be measured in terms of time spent responding: stand- 
ing in a particular location. Standing on one or the 
other side of an experimental chamber was reinforced 
on two concurrent VI schedules. A 4.25-sec COD was 
signaled by the illumination of a white light in the 
chamber, during which time no reinforcement was 
presented. Post-COD standing on either side of the 
chamber was correlated with the illumination of 
either a red or green stimulus light, each associated 
with a different frequency of food presentation. When 
Baum and Rachlin plotted the results of this experi- 
ment in terms of Equation 3—i.e., the relative time 
spent in the presence of either the red or green stim- 
ulus—the data points from most of the pigeons system- 
atically fell below the matching diagonal in a bowed 
curve. However, they found that the results could be 
expressed in terms of the following equation: 


$$\frac{T_1}{T_2} = k\,\frac{r_1}{r_2} \tag{4}$$


The ratio of the times spent on the two sides of the 
chamber was directly proportional to the ratio of the 
rates of reinforcement provided on the two sides. 
When the logarithms of the time ratios are plotted 
against the logarithms of the reinforcement ratios, 
Equation 4 specifies a linear function of the form 
$\log (T_1/T_2) = 1.00 \log (r_1/r_2) + \log k$, where 1.00 is 
the slope and log k the intercept of the function on 
the Y-axis when $\log (r_1/r_2) = 0$. The data from one of 
Baum and Rachlin’s pigeons is shown in Figure 3, 
plotted in terms of both Equation 3 and the logarith- 
mic form of Equation 4. The linear function of least 
squares fit to the log data is shown in the lower panel 
together with the percentage of the variance in the 
dependent variable accounted for by the function. 
The value of k is given by the antilog of the intercept; 
in this case k = .54. The slope of the function approx- 
imates 1.00 as specified by Equation 4. 
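
The fitting step just described can be sketched as follows. This is an illustrative implementation, not the authors' own computation: the function name is hypothetical, the percentage of variance is assumed to be 100(1 - SS_residual/SS_total), and the data are constructed to obey Equation 4 exactly with k = .54 rather than taken from Baum and Rachlin.

```python
import math
from typing import Sequence, Tuple


def fit_log_ratio_line(time_ratios: Sequence[float],
                       reinf_ratios: Sequence[float]) -> Tuple[float, float, float, float]:
    """Least-squares fit of log10(T1/T2) on log10(r1/r2).

    Returns (slope, intercept, k, percent_variance), where k is the antilog
    of the intercept.
    """
    xs = [math.log10(r) for r in reinf_ratios]
    ys = [math.log10(t) for t in time_ratios]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 10 ** intercept, 100.0 * (1.0 - ss_resid / ss_total)


if __name__ == "__main__":
    reinf = [0.25, 0.5, 1.0, 2.0, 4.0]   # hypothetical reinforcement ratios
    times = [0.54 * r for r in reinf]    # constructed so that T1/T2 = .54 r1/r2
    slope, intercept, k, pct = fit_log_ratio_line(times, reinf)
    print(f"slope={slope:.2f} intercept={intercept:.2f} k={k:.2f} variance={pct:.1f}%")
```

With real data the slope and k would be estimated with error; the constructed data simply confirm that a pure bias leaves the slope at 1.00 and shifts only the intercept.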

When k = 1.0, Equation 4 is identical to the 
matching relation described by Equation 3. A k-value 
different from 1.0 signifies a constant proportional 
bias toward one side of the chamber or toward one 
schedule. Such a bias shows up as a constant displace- 
ment from the matching diagonal when the logs of 
the ratios are plotted against one another, as in the 
lower panel of Figure 3. Hence Equation 4 (propor- 
tional ratio matching) and its response equivalent, 


$$\frac{R_1}{R_2} = k\,\frac{r_1}{r_2} \tag{5}$$


are more general than Equations 2 and 3 (matching 
of relative proportions). They account for such factors 
as position or color preferences, or even preferences 
arising from qualitatively different reinforcers (see p. 
252). The bias parameter k takes account of our im- 
perfect knowledge of the reinforcers at work in the 
experimental situation. For example, in the Baum 
and Rachlin (1969) study there were two feeders. If 
they did not yield equal quantities of food, k ≠ 1.0 to 
the extent that the units of $r_1$ and $r_2$ were not equiv- 
alent. Other biases can also be considered as rein- 
forcers not identified in the independent variable. 
Baum (1974a) has therefore suggested that choice in 
concurrent schedules is best described by the ratio of 
responses or times spent responding as some function 
of the ratio of reinforcements. Where relative pro- 
portion matching is obtained, proportional ratio 
matching must also hold, but the converse is not 
necessarily true. 

Fig. 3. Data from one pigeon in Baum and Rachlin’s (1969) 
shuttlebox choice procedure plotted in terms of relative time 
and reinforcement distributions (top panel) and in terms of 
the logarithms of the time and reinforcement ratios (bottom 
panel). The heavy diagonal line represents perfect matching; the 
fine diagonal line plotted through the data points is the least- 
squares regression line for those points. The regression function 
and the percentage of data variance that it accounts for are 
given in the lower panel. 

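
A short derivation may help connect the ratio form (Equation 4) with the bowed curve seen when the same data are plotted as relative proportions (Equation 3). The symbol $\rho$ for relative reinforcement frequency is introduced here only for convenience and is not the chapter's notation.

```latex
% Assumed notation: \rho = r_1/(r_1+r_2), the relative reinforcement frequency.
\begin{align*}
\frac{T_1}{T_2} = k\,\frac{r_1}{r_2}
\;&\Longrightarrow\;
\frac{T_1}{T_1+T_2} = \frac{k\,r_1}{k\,r_1 + r_2}
= \frac{k\rho}{1+(k-1)\rho}, \\
\frac{k\rho}{1+(k-1)\rho} - \rho
&= \frac{(k-1)\,\rho\,(1-\rho)}{1+(k-1)\rho}.
\end{align*}
```

For k = 1.0 the difference vanishes and Equation 3 is recovered. For k < 1.0 the difference is negative at every intermediate value of $\rho$ and zero at the end points, so the relative-time curve bows below the matching diagonal; with k = .54 and $\rho$ = .50, for example, the predicted relative time is about .35.
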
Baum (1975) recently reported time matching for 
humans in a vigilance task. Subjects were required to 
monitor red or green signals projected on a trans- 
lucent plastic screen. The signals were arranged ac- 
cording to a single VI schedule and assigned as red 
or green with varying probabilities (cf. Stubbs and 
Pliskoff, 1969). By pressing down one of two telegraph 
keys the subjects illuminated the screen with either a 
green (left key) or red (right key) floodlight for the 
duration of the press, enabling them to detect the red 
or green signals. (Opposite colors made the signals 
visible.) As long as the appropriate floodlight was on, 
the signal remained on until it was turned off by 
pushing a button next to the depressed key. Turning 
off a signal incremented a “score” counter by 1. To 
reduce subjects’ changeover rates a 2-sec COD was in 
effect and a response cost was programmed on a vari- 
able-ratio (VR) 3 for release of a key after it had been 
depressed. The response cost consisted of the incre- 
ment by 1 of a second counter, the final tally of which 
was subtracted from the “score” counter to determine 
the overall session score. Pilot studies had indicated 
that the response cost was necessary to prevent sub- 
jects from simply alternating as rapidly as possible be- 
tween the two keys. 

To maintain the subjects’ interest throughout the 
experiment it was given a gamelike appearance. As 
captain of a spaceship under enemy siege, the subject 
defended himself by detecting and destroying two 
types of enemy missiles: red ones and green ones. The 
colored floodlights represented the appropriate sen- 
sors; the response cost represented a “hit” by the 
enemy when a sensor was “deactivated.” Subjects 
competed for a monetary bonus given for the highest 
session score (missiles detected minus “hits”) for each 
block of five sessions. 

With a 2-sec COD and a VR 3 response cost, two of 
the three subjects showed excellent matching be- 
tween the ratio of the times for which each key was 
depressed and ratio of detections of each type of 
signal. The third subject was nearer indifference than 
predicted by matching, but when the response cost in- 
creased to a “hit” for every release, changeover rate 
declined and excellent matching was obtained over a 
wide range of signal ratios. 


Assessment of the Empirical Evidence for Matching 


How strong a generalization is the matching rela- 
tion from the data just reviewed? Proportional ratio 
matching (Equations 4 and 5) specifies that the regres- 
sion line relating log response or time ratios to log 
reinforcement ratio have a slope of 1.0. A slope greater 

Table 1  Percentage of data variance accounted for by least-squares regression lines and best fit line of 1.0 slope relating 
log response ratio to log reinforcement ratio for both individual subjects and groups 

                                       LEAST-SQUARES LINE     BEST FIT LINE          NO. OF 
                                                              OF 1.0 SLOPE           POINTS   COD 

INDIVIDUAL DATA 
Herrnstein (1961) 
  P231                                 1.36X - .04 (99.0%)    1.00X - .12 (92.2%)    5        1.5 sec 
  P055                                 .80X - .01 (98.3%)     1.00X + .01 (92.3%)    5        1.5 sec 
Catania (1963b) 
  P117                                 .83X - .01 (90.5%)     1.00X - .02 (86.2%)    8        2 sec 
  P243                                 .77X + .01 (92.9%)     1.00X - .03 (87.2%)    8        2 sec 
  P294                                 .80X + .03 (90.3%)     1.00X (84.8%)          8        2 sec 
Stubbs & Pliskoff (1969) 
  P103                                 .86X + .08 (96.0%)     1.00X + .04 (93.7%)    4        2 sec 
  P104                                 1.24X - .01 (92.6%)    1.00X + .06 (89.1%)    4        2 sec 
  P108                                 .94X + .02 (98.4%)     1.00X + .01 (98.0%)    4        2 sec 
Silberberg & Fantino (1970) 
  A                                    4X - .05 (99.97%)      1.00X - .06 (87.8%)    3        .88 sec 
  E                                    .77X - .06 (99.7%)     1.00X - .06 (91.0%)    3        .88 sec 
  C                                    .94X + .07 (48.0%)     1.00X + .04 (42.0%)    3        1.75 sec 
  B                                    .55X - .18 (61.0%)     1.00X + .10 (20.5%)    3        1.75 sec 
  G                                    .63X - .05 (99.0%)     1.00X + .10 (66.1%)    3        3.5 sec 
  H                                    .93X + .17 (99.8%)     1.00X + .16 (99.2%)    3        3.5 sec 
Trevitt, Davison, & Williams (1972) 
  P101                                 2X + .02 (94.9%)       1.00X - .04 (10.9%)    4        3 sec 
  P102                                 1.01X - .15 (98.8%)    1.00X - .15 (98.5%)    4        3 sec 
  P105                                 .61X - .17 (91.5%)     1.00X - .21 (65.8%)    4        3 sec 
  P106                                 .73X - .04 (97.9%)     1.00X - .07 (85.5%)    4        3 sec 
Baum (1972) 
  A                                    .95X - .01 (99.4%)     1.00X - .01 (99.2%)    5        1.8 sec 
McSweeney (1975) 
  8429                                 4X + .03 (93.8%)       1.00X - .04 (89.6%)    4        2 sec 
  8772                                 .89X - .01 (86.8%)     1.00X - .04 (85.6%)    4        2 sec 
  8845                                 1.01X - .03 (95.3%)    1.00X - .03 (95.1%)    4        2 sec 
  8927                                 .79X - .02 (84.2%)     1.00X - .08 (83.7%)    4        2 sec 

GROUP DATA 
Herrnstein (1961)                      1.11X - .02 (92.3%)    1.00X - .03 (91.4%)    12       1.5 sec 
Catania (1963b)                        .80X + .01 (90.7%)     1.00X - .02 (86.3%)    24       2 sec 
Stubbs & Pliskoff (1969)               1.01X + .03 (92.5%)    1.00X + .04 (90.4%)    12       2 sec 
Silberberg & Fantino (1970)            .85X + .04 (92.9%)     1.00X + .05 (90.0%)    18       .88-3.5 sec 
Trevitt, Davison, & Williams (1972)    .76X - .19 (88.7%)     1.00X - .12 (80.7%)    16       3 sec 
Baum (1974b)                           1.03X + .05 (99.3%)    1.00X + .04 (99.1%)    5        no COD 
McSweeney (1975)                       .85X - .01 (87.2%)     1.00X - .02 (84.2%)    16       2 sec 



Table 2 Percentage of data variance accounted for by least-squares regression lines and best fit line of 1.0 slope relating
log time ratio to log reinforcement ratio for both individual subjects and groups


                                                                BEST FIT LINE           NO. OF
                                        LEAST-SQUARES LINE      OF 1.0 SLOPE            POINTS   COD

INDIVIDUAL DATA

Catania (1963b)
  P117                                  .94X - .04 (94.8%)      1.00X - .05 (94.5%)     8        2 sec
  P243                                  .76X - .06 (86.0%)      1.00X + .02 (77.2%)     8        2 sec
  P294                                  .94X - .04 (93.4%)      1.00X - .05 (93.1%)     8        2 sec
Brownstein & Pliskoff (1968)
  P93                                   1.04X + .01 (98.6%)     1.00X + .01 (98.5%)     5        2 sec
Baum & Rachlin (1969)
  P334                                  .84X - .25 (97.3%)      1.00X - .24 (92.5%)     11       4.25 sec
  P360                                  .63X - .12 (98.1%)      1.00X - .11 (64.4%)     11       4.25 sec
  P488                                  1.09X - .29 (96.5%)     1.00X - .29 (95.4%)     11       4.25 sec
  P489                                  1.15X - .49 (90.3%)     1.00X - .49 (88.8%)     11       4.25 sec
  P490                                  1.29X - .06 (96.3%)     1.00X - .08 (91.5%)     11       4.25 sec
  P496                                  .98X - .27 (98.5%)      1.00X - .26 (97.9%)     11       4.25 sec
Stubbs & Pliskoff (1969)
  P103                                  1.03X + .11 (97.5%)     1.00X + .12 (97.4%)     4        2 sec
  P104                                  1.24X + .00 (92.8%)     1.00X + .07 (89.4%)     4        2 sec
  P108                                  1.07X + .01 (99.0%)     1.00X + .03 (98.5%)     4        2 sec
Silberberg & Fantino (1970)
                                        .87X - .21 (99.7%)      1.00X - .21 (97.5%)     3        .88 sec
                                        1.06X - .18 (96.2%)     1.00X - .18 (95.9%)     3        .88 sec
                                        8X + .45 (99.8%)        1.00X + .21 (48.6%)     3        1.75 sec
                                        .84X - .12 (98.7%)      1.00X - .02 (90.2%)     3        1.75 sec
                                        I7X - .08 (96.8%)       1.00X - .06 (96.6%)     3        3.5 sec
                                        1.22X - .06 (96.3%)     1.00X - .01 (95.6%)     3        3.5 sec
Trevitt, Davison, & Williams (1972)
  P101                                  .60X - .02 (94.5%)      1.00X - .06 (35.0%)     4        3 sec
  P102                                  1.11X - .08 (95.9%)     1.00X - .08 (95.0%)     4        3 sec
  P105                                  .70X - .10 (94.0%)      1.00X - .12 (80.9%)     4        3 sec
  P106                                  8X - .08 (98.2%)        1.00X - .08 (98.1%)     4        3 sec
Baum (1975)
  Doug                                  1.16X - .08 (90.6%)     1.00X - .08 (82.8%)     10       2 sec
  Noa                                   .98X + .03 (93.3%)      1.00X + .04 (92.6%)     10       2 sec
  John I                                .67X (96.1%)            1.00X + .01 (70.2%)     10       2 sec
  John II                               .94X + .15 (93.7%)      1.00X + .15 (93.3%)     1        2 sec

GROUP DATA

  Catania (1963b)                       .89X - .01 (90.2%)      1.00X - .02 (88.7%)     24       2 sec
  Brownstein & Pliskoff (1968)          .94X + .02 (97.5%)      1.00X + .02 (96.6%)     12       2-7.5 sec
  Baum & Rachlin (1969)                 1.01X - .24 (89.4%)     1.00X - .24 (89.0%)     66       4.25 sec
  Stubbs & Pliskoff (1969)              1.11X + .04 (95.0%)     1.00X + .07 (94.0%)     12       2 sec
  Silberberg & Fantino (1970)           1.07X - .05 (93.9%)     1.00X - .05 (93.4%)     18       .88-3.5 sec
  Trevitt, Davison, & Williams (1972)   .88X - .08 (91.0%)      1.00X - .09 (89.4%)     16       3 sec
  Baum (1975)                           .93X (87.6%)            1.00X (85.6%)           30       2 sec

than 1.0 represents overmatching, a stronger preference
for the schedule providing the more frequent
reinforcement than that predicted by matching. A
slope less than 1.0 represents undermatching, a weaker
preference for the richer schedule than that predicted.
In the following analysis a least-squares line of best
fit was calculated for the data from all of the published
experiments on concurrent VI VI schedules in
which at least three different ratios of reinforcement
frequency were studied. Data points showing exclusive
preference for one or other schedule where the reinforcement
ratio was either 0 or ∞ were excluded from
the analysis, since neither of these can be expressed
as a logarithm, although they are in keeping with the
matching relation. To evaluate Equations 4 and 5 the
percentage of data variance in the dependent variable
accounted for by proportional ratio matching (i.e., by
a line of 1.0 slope) was also calculated in each case.
This analysis is shown in Table 1 for response ratios
for both individual and group data. The group regression
lines were calculated from all the data points
provided by the subjects in that experiment. Table 2
shows the same analysis for time ratios.²
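
The analysis just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fit_lines` is hypothetical, the best-fit line of 1.0 slope is taken to be the slope-1 line whose intercept minimizes squared error, and "percentage of variance accounted for" is computed as 100 × (1 − residual sum of squares / total sum of squares). The chapter does not spell out its formulas, so these choices may differ in detail from the original computations.

```python
import math


def fit_lines(reinf_ratios, behav_ratios):
    """Fit log10(behavior ratio) against log10(reinforcement ratio).

    Returns the least-squares line and the best-fitting line of slope 1.0,
    each as (slope, intercept, percent of variance accounted for).
    Assumes at least two usable points with differing x and y values.
    """
    # Ratios of 0 or infinity have no finite logarithm and are dropped,
    # mirroring the exclusion rule stated in the text.
    pairs = [(math.log10(r), math.log10(b))
             for r, b in zip(reinf_ratios, behav_ratios)
             if 0 < r < math.inf and 0 < b < math.inf]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    ss_tot = sum((y - my) ** 2 for _, y in pairs)

    def pct_variance(slope, intercept):
        # Assumed definition: 100 * (1 - SS_residual / SS_total).
        ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in pairs)
        return 100.0 * (1.0 - ss_res / ss_tot)

    slope = sxy / sxx
    intercept = my - slope * mx
    bias = my - mx  # least-squares intercept when the slope is fixed at 1.0
    return {
        "ls": (slope, intercept, pct_variance(slope, intercept)),
        "unit": (1.0, bias, pct_variance(1.0, bias)),
    }


# Hypothetical data: behavior ratio = (reinforcement ratio) ** 0.8,
# i.e. pure undermatching with no bias.
example = fit_lines([0.25, 0.5, 1, 2, 4],
                    [r ** 0.8 for r in [0.25, 0.5, 1, 2, 4]])
```

With the hypothetical data above, the least-squares line recovers the slope of 0.8 exactly, while the slope-1 line necessarily accounts for less of the variance. That gap between the two percentages is the comparison reported in Tables 1 and 2.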

The matching relation accounts for over 80% of 
the data variance for 18 of the 23 individual subjects 
in Table 1 (response ratios), and for 22 of the 27 sub- 
jects in Table 2 (time ratios). It fails to account for a 
substantial proportion of the data for only 4 subjects, 
2 in the Trevitt, Davison, and Williams (1972) study
and 2 in the Silberberg and Fantino (1970) study. In 
fact, for the response ratios of the latter 2 pigeons, 
even the least-squares line accounts for little of the 
variance. A possible explanation for the marked 
deviation of these subjects from matching is con- 
sidered later in this section. 

Group data are important, especially for the studies 
in which only a few points were obtained for each 
subject. Here the matching relation accounts for over 
80% of the variance in response ratios and for over 
85% of the variance in time ratios for all the experi- 
ments. In all cases, the matching line is less than 8% 
worse than the regression line of least-squares fit. 

Nevertheless, an important point in assessing the 
matching formulation is whether the deviations from 
matching are systematic. For time ratios (Table 2) 
there is no systematic deviation. The individual re- 
gression lines vary equally on both sides of matching, 
with a median slope of .98. For response ratios (Table 
1) the slopes of the individual regression lines do tend 
toward undermatching, with a median slope of .80. 


2 In calculating the group regression line for Baum’s (1975)
data, the results from John I rather than John II were used,
since the experimental parameters were the same for John I and
the other two subjects.


However, several methodological considerations must
be taken into account in evaluating the data from 
some of the studies. 

The strongest evidence for systematic undermatch- 
ing comes from Trevitt et al. (1972). In this experi- 
ment four different concurrent VI VI values were 
studied in the course of an experiment on choice in 
two-key concurrent VI FI schedules. All of the data 
points for the concurrent VI VI schedule were ob- 
tained after long exposure to the different VI FI 
values, five VI FI conditions being studied before the 
first VI VI. In both the VI FI and VI VI conditions, 
both keys were illuminated with white light, and the 
FI schedule was always associated with the same key. 
During the VI VI procedure three of the four pigeons 
showed a bias toward the key that had been correlated 
with the VI schedule and preferred throughout the 
concurrent VI FI. All four pigeons showed regression 
lines for response ratios with very similar slopes to 
those obtained for them in the VI FI conditions. Both 
Trevitt et al. and Nevin (1971) demonstrated that re- 
gression lines relating response ratios to obtained 
reinforcement ratios on concurrent VI FI schedules 
have slopes of considerably less than 1.0. The prior 
exposure to concurrent VI FI schedules could there- 
fore have affected the VI VI data in this study. 

The Silberberg and Fantino (1970) study raises
another consideration: control for order effects. The
two pigeons that showed particularly marked undermatching,
subjects B and C, were exposed to only
three different relative reinforcement frequencies,
with the same key providing more frequent reinforcement
in each case. Subject B was successively exposed
to relative reinforcement frequencies of .33, .20, and
.11, subject C to relative frequencies of .67, .90, and
.88. Any order effects in which the previous reinforcement
conditions affected responding on the new reinforcement
schedules (so-called hysteresis effects: Baum,
1974a; Stevens, 1957) could have produced the flat
regression lines for the pigeons.³
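
The numerical example in footnote 3 can be checked directly. The sketch below uses the relative frequencies given in that footnote; the helper names `log_ratio` and `least_squares` are hypothetical and serve only this illustration.

```python
import math


def log_ratio(p):
    """Convert a relative frequency p into log10 of the ratio p / (1 - p)."""
    return math.log10(p / (1.0 - p))


def least_squares(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


rel_reinf = [0.50, 0.60, 0.70, 0.80]  # relative reinforcement frequencies (footnote 3)
rel_resp = [0.58, 0.66, 0.74, 0.82]   # relative response rates (footnote 3)

slope, intercept = least_squares([log_ratio(p) for p in rel_reinf],
                                 [log_ratio(p) for p in rel_resp])

# True if responding on the preferred key is at or above matching at every point.
exceeds_matching = all(b >= r for r, b in zip(rel_reinf, rel_resp))
```

Here the fitted slope comes out at roughly 0.86 with a positive intercept, even though every relative response rate lies above the corresponding relative reinforcement frequency. A slope below 1.0 therefore cannot, on its own, be read as a shift toward indifference when only one side of the reinforcement range has been sampled.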

Another major factor to be taken into account is 
the role of the COD in matching. Many studies 
(Brownstein & Pliskoff, 1968; Herrnstein, 1961; Shull 
& Pliskoff, 1967) have shown that a minimum COD 
duration is necessary for matching to be obtained, but 


3 Undermatching in terms of a slope <1.0 does not necessarily
mean that the subject is nearer indifference between the
schedules than is predicted by the matching relation. A subject
could actually respond more on the preferred key than matching
predicts, yet still produce a regression line of slope <1.0. For
example, if relative reinforcement frequencies of .50, .60, .70, and
.80 were studied (i.e., if the same key always provided the more
frequent reinforcement), relative response rates of .58, .66, .74,
and .82 would produce a regression line with a slope flatter than
1.0. This was the case for the two pigeons that produced the
flattest functions in the study by Silberberg and Fantino (1970).


there is no particular reason to believe that the same
minimum COD value will suffice for all subjects. Just 
as there are individual and species differences in many 
discrimination tasks, it is likely that different COD 
values will be needed for individual subjects before 
they properly discriminate the two reinforcement 
schedules in a concurrent procedure. For example, a 
considerably longer minimum COD of 5 to 10 sec is 
apparently needed for rats (de Villiers & Millenson, 
1972; Shull & Pliskoff, 1967) than for pigeons (about 1
to 3 sec; Catania, 1966; Herrnstein, 1961). Brownstein
and Pliskoff (1968) reported that different COD values 
(between 2 and 7.5 sec) were needed to obtain match- 
ing for each of their three pigeons, and Schroeder and 
Holland (1969) needed a longer COD for one of their 
human subjects before matching was observed. 

At all COD durations less than the minimum re- 
quired for matching the subjects are nearer indiffer- 
ence between the two schedules and show a flatter 
function relating response and reinforcement ratios 
than that predicted by the matching relation—i.e., 
undermatching. Yet most studies have programmed 
the same short COD for all subjects, usually between 
1 and 3 sec in duration. Therefore, it is not surprising 
that in most studies some subjects show regression 
lines with slopes less than 1.0. It is worth noting that 
only one of the studies shown in Tables 1 and 2
(Brownstein & Pliskoff, 1968) varied the COD duration 
for individual subjects until matching was obtained 
at the first relative reinforcement frequency. The only 
subject in that experiment for which three different 
relative reinforcement rates (besides 1.0) were studied 
produced a regression line relating time and rein- 
forcement ratios with a slope of 1.04. The regression 
line for the group of three subjects, including all the 
data points, was Y = .94X + .02 (with 97.5% of the 
data variance accounted for)—close to perfect 
matching. 

The matching relation, therefore, holds only under 
certain conditions. A COD of sufficient duration must 
be used for each subject, and the reinforcement sched- 
ules should be run in balanced order across the two 
keys to obviate order effects. Many of the studies cited 
in Tables 1 and 2 did not satisfy these conditions. 
Since most of the known factors that lead to systematic 
deviations from matching lead to undermatching 
(Baum, 1974a), this outcome will be obtained in ex- 
periments that fail to control for them. 


The Role of the COD in Matching 


The importance of the COD in the matching rela- 
tion raises questions about the generality of matching. 
Indeed, it could be argued that the dependency of
matching on a minimum COD duration severely
limits the generality of the principle. This would fol- 
low if matching occurred only at certain arbitrary 
COD values. In fact, although different minimum 
COD values may be required for different species and 
even for different individual subjects, matching is 
found for all values of the COD greater than this 
minimum (Allison & Lloyd, 1971; Shull & Pliskoff, 
1967; Stubbs & Pliskoff, 1969). The matching relation
is therefore not an artifact of any particular COD 
duration, although it requires a minimum separation 
of the two schedules in time. 

On the other hand, Pliskoff (1971) has argued that 
response or time distributions may themselves be by- 
products of changeover (CO) responding, which de- 
pends on both the COD duration and the relative 
reinforcement frequency. With a fixed COD duration, 
CO rate decreases as the relative reinforcement fre- 
quency diverges from .50; and with a fixed relative 
reinforcement frequency, CO rate decreases as the 
COD increases (Shull & Pliskoff, 1967; Stubbs & 
Pliskoff, 1969). But the results of Stubbs and Pliskoff’s 
(1969) experiment do not support Pliskoff’s conclusion 
that matching is determined by CO responding. They 
fixed the relative reinforcement frequency at .75, and 
although overall CO rate declined systematically with 
increasing COD duration, relative response rate and 
time allocation did not change. Matching was ob- 
tained at all COD values from 2 through 32 sec. This 
suggests that relative reinforcement frequency is the 
crucial variable in determining response and time 
matching. 

Nevertheless, the role of the minimum COD values 
required for matching in concurrent schedules has 
still to be clarified. Catania (1966) and Herrnstein 
(1961, 1970) suggest that the COD separates the two 
schedules in time and so reduces the adventitious 
reinforcement of left-right or right-left response se- 
quences. As mentioned earlier, Catania and Cutts 
(1963) showed that without a COD, responses to one 
alternative came partially under the control of the 
other reinforcement schedule. One function of the 
COD is therefore to prevent such concurrent super- 
stitions by introducing a delay between a response to 
one alternative and the reinforcement from the other. 

Pliskoff (1971) has suggested that the COD func- 
tions to punish the CO response since it specifies a 
period of time during which no response will be rein- 
forced if the subject changes over. By thus decreasing 
CO rate, the COD separates the two schedules. Delays 
of increasingly longer duration produce larger decre- 
ments in CO rate, suggesting that increasing the COD 
may be comparable to increasing shock intensity in a 
punishment paradigm.

There are certain similarities between performances
on concurrent schedules with CODs and those
in which the CO response is explicitly punished by a 
shock or by a time-out from reinforcement. Todorov 
(1971) demonstrated that CO rate decreased and rela- 
tive response rate increased as the intensity of shock 
or duration of time-out increased, as they do with in- 
creasing COD duration (Allison & Lloyd, 1971; Shull 
& Pliskoff, 1967). 

However, the parallel breaks down under further 
scrutiny. Shull and Pliskoff (1967) observed that at
all COD durations, relative response rates and time 
distributions continued to match the relative obtained 
reinforcement frequencies, and the local response rates 
tended to be the same on both schedules. On the other 
hand, Todorov (1971) found that with increasing 
punishment of the CO response, the relative rate of 
responding increased more rapidly than the relative 
time distribution, and the local rates of responding 
deviated more and more from equality. Relative rein- 
forcement frequency did not change significantly as 
punishment increased, so that neither response nor 
time matching was obtained at time-out durations
longer than 3 sec or at shock intensities higher than 
4 mA. 

In another procedure, Stubbs and Pliskoff (1969) 
programmed a fixed-ratio (FR) requirement of 20 
responses on the CO key while the main key was 
darkened and the VI programmers stopped—i.e., the 
FR functioned like Todorov’s time-out punishment. 
With the FR requirement in effect, relative response 
rate again overmatched the distribution of reinforce- 
ments and local response rates deviated from equality. 
These two experiments therefore fail to demonstrate 
any functional equivalence between the COD and
direct punishment of the CO response. 

It is nevertheless questionable whether these pro- 
cedures constitute an adequate test of the hypothesis 
that the COD punishes changeovers. Since the rein- 
forcement schedules did not run during the timeout 
or the FR contingent on the CO response, the 
punishers interacted with the VI schedules in a way 
that the COD does not. Similarly, the shock used as 
a punisher by Todorov could reduce the relative
value of the food reinforcement (de Villiers & Millen- 
son, 1972; Millenson & de Villiers, 1972). Pliskoff’s 
(1971) punishment hypothesis therefore remains as a 
possible explanation of the COD’s effects on respond- 
ing. 

Silberberg and Fantino (1970) proposed that the 
COD has a more complex role in the matching rela- 
tion than simply separating the two schedules in time. 
They reported that response rates during the COD 
period following a CO were considerably higher than
post-COD response rates. Relative response rates
within the COD approximated .50 (indifference) 
while post-COD response rates overmatched the over- 
all frequency of reinforcement. Only when these two 
response rates were added did overall relative re- 
sponse rate closely match relative reinforcement fre- 
quency. From these results, Silberberg and Fantino 
concluded that the matching relation depends on the 
interaction of COD and post-COD response patterns 
and on the perseverance of the COD response burst 
into the post-COD period on the preferred key. Since 
the probability of reinforcement on one VI schedule 
increases the longer the subject spends responding on 
the other schedule, they suggested that the high re- 
sponse rates during the COD on both keys reflect the 
increased local probability of reinforcement imme- 
diately after a CO. 
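
The arithmetic behind this argument can be made concrete with a small sketch. All response counts below are hypothetical, chosen only so that indifference within the COD and overmatching after it combine to overall matching; they are not data from Silberberg and Fantino (1970).

```python
def relative_rate(preferred, other):
    """Relative rate of responding on the preferred key."""
    return preferred / (preferred + other)


# Hypothetical response counts, for illustration only.
cod_preferred, cod_other = 300, 300    # within-COD responding: indifference
post_preferred, post_other = 900, 100  # post-COD responding: overmatching
relative_reinforcement = 0.75          # hypothetical relative reinforcement frequency

within_cod = relative_rate(cod_preferred, cod_other)
post_cod = relative_rate(post_preferred, post_other)
overall = relative_rate(cod_preferred + post_preferred,
                        cod_other + post_other)
```

With these numbers the within-COD relative rate is .50 and the post-COD relative rate is .90, yet the pooled relative rate equals the assumed relative reinforcement frequency of .75. The example shows only that such a combination is arithmetically possible; whether matching in fact depends on it is the empirical question taken up next.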

Pliskoff (1971) similarly observed that with equal
concurrent VI schedules, the response rate during the
COD was higher than the post-COD response rate at
all COD durations between .33 and 27 sec. But
Pliskoff found that response rate during the COD was
highest with a 1-sec COD and then declined as the
COD increased, whereas the post-COD response rate
was fairly constant at all COD values. Unfortunately,
the same analysis has not been performed with varied
relative reinforcement frequencies, but Pliskoff’s finding
suggests that the matching relation is not an artifact of the
COD. The difference between COD and post-COD
response rate decreases considerably as the COD
lengthens, so response bursts during the COD must
contribute a different relative amount to the overall
rate of responding as the COD varies. Yet the matching
of overall relative response rate to the relative frequency
of reinforcement is maintained at all these
COD durations (Shull & Pliskoff, 1967; Stubbs &
Pliskoff, 1969).

Other results suggest that the high response rates 
generated during the COD are unnecessary for match- 
ing if the CO rate is low enough to separate the sched- 
ules in time. Kulli (unpublished data) showed that 
pigeons’ CO rates systematically decreased with in- 
creasing body weight. Even without a COD, relative 
response rates matched relative reinforcement fre- 
quency when the birds were at 100 to 110% of their 
preexperiment ad libitum weights. 

In brief, there is substantial evidence that a certain 
amount of separation and differentiation between the 
two reinforcement conditions is a necessary condition 
for matching in concurrent schedules and that this 
separation can be produced by several methods that 
reduce CO rate. However, any role played by the 
COD beyond merely producing this separation is not 
clear (cf. Kulli’s data).


MAXIMIZING OR MATCHING


Shimp (1966, 1969a) has argued that matching is 
not fundamental, but is produced by more molecular 
interaction between choices and the probability of 
reinforcement. He scheduled reinforcements probabil- 
istically for choices in a discrete-trial procedure 
(Shimp, 1966). A contingency similar to that in a VI 
schedule was employed in that a reinforcement pro- 
grammed for a particular choice remained available 
until produced by the subject. The probability of 
reinforcement on one key therefore increased while 
the bird responded on the other key. Shimp found 
that both the initial postreinforcement choices and 
the sequential changes in choice probabilities between 
reinforcements corresponded to the differences in 
probability of reinforcement for each choice arranged 
by the schedule. He concluded that the overall match- 
ing found in concurrent VI schedules is a by-product 
of the subjects’ tendency to maximize—i.e., to choose 
the alternative with the higher momentary probability 
of reinforcement on each choice trial. 
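
A minimal sketch of momentary maximizing follows. It rests on an assumption that goes beyond the text: on every trial a reinforcer is set up for each key independently with a fixed probability and is then held until collected, so that the chance of a reinforcer waiting on a key not chosen for n trials is 1 − (1 − p)^n. The function names and the probabilities used are hypothetical.

```python
def p_available(p_setup, trials_since_choice):
    """Probability that a held reinforcer is waiting on a key.

    Assumes an independent set-up probability p_setup on each trial and that
    a reinforcer, once set up, stays available until the key is chosen.
    """
    return 1.0 - (1.0 - p_setup) ** trials_since_choice


def momentary_maximizing(p_left, p_right, n_trials):
    """Choose, on every trial, the key with the higher momentary probability."""
    setup = {"left": p_left, "right": p_right}
    since = {"left": 1, "right": 1}  # trials since each key was last chosen
    choices = []
    for _ in range(n_trials):
        pick = max(("left", "right"),
                   key=lambda k: p_available(setup[k], since[k]))
        choices.append(pick)
        for k in since:
            # The chosen key starts over; the other key keeps accumulating.
            since[k] = 1 if k == pick else since[k] + 1
    return choices


sequence = momentary_maximizing(0.3, 0.1, 8)
```

Under these assumptions a strict maximizer stays on the richer key until the accumulated probability on the leaner key overtakes it, switches for one trial, and returns. It is this kind of regular sequential pattern that the momentary-maximizing account predicts and that the discrete-trial experiments examine.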

However, discrete-trial experiments by Nevin 
(1969) and Herrnstein (unpublished data), using con- 
current VI schedules, produced contrary results. Both 
Nevin and Herrnstein obtained matching, but sequen- 
tial changes in choice probabilities and postreinforce- 
ment choices did not correspond to the momentary 
probability of reinforcement on each key. Nevin did 
find that choices were determined by changes in the 
relative frequency of reinforcement within sequences 
of trials between reinforcements, but as shown in 
Figure 4, Herrnstein found no relation between the 
pigeons’ choices and the changing relative probability 
of reinforcement. Momentary maximizing is there- 
fore not necessary for matching in concurrent VI 
schedules. 

It is an empirical question whether matching can 
be explained by some combination of more molecular 
processes, but as Herrnstein (1970) has pointed out, 
there is no logical reason to assume that the matching 
relation must be explained at a molecular level. 
Choice behavior in concurrent schedules may be more 
orderly at the level of the matching equation than 
at the level of local response rate variations, sequences 
of choices, or relative changeover frequencies. 

[Figure 4. Panels: LEFT KEY and RIGHT KEY. Vertical axes: relative number of choices; relative probability of reinforcement. Horizontal axis: trials since last choice of key.]

Fig. 4. The two step functions plot changes in the relative
momentary probability of reinforcement with the number of
trials since the last choice of each key in a discrete-trial concurrent
VI VI schedule. The open and closed circles plot the
relative number of responses on each key, averaged across two
pigeons, as a function of the number of trials since the last
choice of that key. The horizontal lines show the overall relative
reinforcement rates on the two VI schedules. (Herrnstein,
unpublished data.)


Nevertheless, the matching relation can perhaps be
incorporated into a more general model of maximizing
payoff per response. Indeed, Herrnstein and Loveland
(1975) have argued that an implicit assumption
of maximizing is present in our thinking about
the interaction between reinforcement and behavior.
It is an integral part of our conception of reinforcement
that an animal engages in the more highly reinforced
response when given a choice between two incompatible
responses that differ in reinforcement.
Herrnstein and Loveland suggest that in a choice between
two VI schedules the animal adjusts its behavior
to equalize the ratio of responses to reinforcements
on each alternative, which produces matching:


$$
\frac{R_1}{r_1} = \frac{R_2}{r_2}, \qquad \frac{R_1}{R_2} = \frac{r_1}{r_2}, \qquad \text{and} \qquad \frac{R_1}{R_1 + R_2} = \frac{r_1}{r_1 + r_2} \tag{6}
$$


where $R_1$ and $R_2$ stand for responses to each of the
two alternatives and $r_1$ and $r_2$ stand for the corresponding
number of reinforcements. At any given
moment, the animal performs that response which
seems to have the most favorable response-to-reinforcement
ratio. By responding to that alternative it
drives its response-to-reinforcement ratio up toward 
that for the other alternative. When the other sched- 
ule offers the more favorable ratio, the animal 
switches to it. In choice between two unequal ratio 
schedules, where the experimenter fixes the response- 
to-reinforcement ratios, this maximization implies 
exclusive preference for the alternative with the 
smallest ratio of responses to reinforcements. In fact, 
Herrnstein (1958) and Herrnstein and Loveland (1975) 
have shown that on concurrent FR FR or VR VR 
schedules, pigeons will respond exclusively to the 
alternative with the shortest ratio requirement, provided
that the difference between the two ratios is
larger than some minimum value. 
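
The step from equal response-to-reinforcement ratios to matching can be written out in full. In the derivation below, $R_1$, $R_2$ are the responses and $r_1$, $r_2$ the reinforcements on the two alternatives, as in Equation 6; the common ratio $k$ is a symbol introduced here only for convenience.

```latex
\begin{align*}
\frac{R_1}{r_1} = \frac{R_2}{r_2} = k
  &\;\Longrightarrow\; R_1 = k\,r_1, \quad R_2 = k\,r_2 \\
  &\;\Longrightarrow\; \frac{R_1}{R_1 + R_2}
     = \frac{k\,r_1}{k\,r_1 + k\,r_2}
     = \frac{r_1}{r_1 + r_2}.
\end{align*}
```

On concurrent ratio schedules the experimenter fixes $R_1/r_1$ and $R_2/r_2$ at different values, so no allocation of responses can equalize them. Responding exclusively to the smaller ratio is then the only outcome consistent with always choosing the more favorable alternative.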

This analysis of the role of maximizing in the 
matching relation leads to an interesting prediction 
for performance in concurrent VI VR schedules. Ac- 
cording to the matching relation, an animal’s re- 
sponse-to-reinforcement ratio on the VI schedule 
should come to equal the response-to-reinforcement 
ratio on the ratio schedule (see Equation 6). But 
should the VI schedule be short enough so that the 
obtained reinforcements per response for the VI sched- 
ule rise above the value fixed for the ratio schedule, 
responding to the ratio alternative should cease, since 
it no longer can offer the most favorable ratio of re- 
sponses to reinforcements. Herrnstein (1970, 1971) 
showed that pigeons do match response ratios to rein- 
forcement ratios on VI VR schedules over a wide 
range of schedule values. But once the VI was rich 
enough or the VR high enough so that the same rate 
of reinforcement could be obtained for fewer re- 
sponses on the VI, responding on the VR tended to 
cease; that is, relative response rate drifted toward 
exclusive preference for the VI schedule. 
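
The prediction can be illustrated with a deliberately simplified sketch. It assumes, beyond anything stated in the text, that the animal emits a fixed total number of responses per session and that the VI schedule delivers a fixed number of reinforcers regardless of how responses are divided. The function name and all numbers are hypothetical.

```python
def vi_vr_allocation(total_responses, vi_reinforcers, vr_requirement):
    """Split responses so that responses per reinforcer are equal on both schedules.

    Simplifying assumptions (not from the source): total response output is
    fixed, and the VI yields vi_reinforcers whatever the allocation.
    Returns (responses_on_vi, responses_on_vr).
    """
    # The VR fixes its own ratio at vr_requirement responses per reinforcer,
    # so equality requires the same ratio on the VI.
    responses_vi = vr_requirement * vi_reinforcers
    if responses_vi >= total_responses:
        # Even exclusive VI responding costs no more per reinforcer than the
        # VR, so the VR never offers the more favorable ratio.
        return total_responses, 0
    return responses_vi, total_responses - responses_vi


moderate_vi = vi_vr_allocation(3000, 40, 50)  # VI leaner than the break-even point
rich_vi = vi_vr_allocation(3000, 80, 50)      # VI rich enough to dominate the VR
```

In the first hypothetical case the allocation is 2000 responses on the VI and 1000 on the VR. The VR then yields 20 reinforcers, so relative responding and relative reinforcement on the VI are both two-thirds. In the second case the VI is rich enough that responding on the VR drops out altogether, which corresponds to the drift toward exclusive preference described above.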


TIME MATCHING AS THE FUNDAMENTAL 
MATCHING PROCESS 


Brownstein and Pliskoff (1968), Baum and Rachlin 
(1969), and more recently Rachlin (1973) have all 
argued that matching of relative time allocation to the 
relative frequency of reinforcement is more basic than 
response matching in concurrent schedules. They 
demonstrated time matching in situations in which 
there were no response requirements apart from the 
allocation of time to the particular stimuli associated 
with each schedule. Baum and Rachlin (1969) pro- 
posed that even a series of repetitions of discrete re- 
sponses such as key pecks or lever presses can be 
thought of as periods of time spent engaging in a continuous
activity (key pecking or lever pressing), because
these responses tend to be emitted in bursts of
responses at a constant rate (Blough, 1963). The time 
spent pecking then determines the number of pecks 
emitted, and overall rate of responding over a session 
varies with the duration of pauses between the bursts. 

The time spent in each component of a CO-key 
concurrent schedule, or cumulated interchangeover 
time in a two-key procedure, cannot directly measure 
the time spent responding as Baum and Rachlin de- 
fine it, because it will always include time spent at 
other activities—e.g., grooming. These measures of 
time distribution between the two schedules will only 
be directly proportional to time spent responding if 
the proportions of time spent in other activities in 
each component are invariant across experimental
conditions. Baum and Rachlin (1969) therefore sug- 
gest that the relative number of responses may pro- 
vide the best measure of the relative time spent re- 
sponding. As long as the time required for a response 
remains fairly constant, the number of responses will 
be directly proportional to time spent responding. 

But it then becomes difficult, if not impossible, to 
distinguish empirically between response matching 
and matching in terms of time spent responding. 
While there are choice situations in which an analysis 
in terms of number of responses would be arbitrary 
(Baum & Rachlin, 1969; Brownstein & Pliskoff, 1968), 
there are also situations in which an analysis in terms 
of time would be arbitrary. Nevin (1969) and Herrn- 
stein (unpublished data) both found response match- 
ing in discrete-trial procedures in which the pigeon 
had only occasional pecks at the key.

On the other hand, Rachlin (1973) argues that by 
allocating time to each schedule in a concurrent sched- 
ule procedure, the animal equalizes the local fre- 
quency of reinforcement for each alternative. Since 
local response rates are also equal on concurrent VI 
VI schedules (Catania, 1966; Killeen, 1972b; Shull & 
Pliskoff, 1967), overall response matching results from 
the matching of time allocation. But time allocation 
in the sense of equalizing local reinforcement frequencies
refers to the total time spent in each component
(or to cumulated interchangeover time), not
to time spent responding as defined by Baum and 
Rachlin (1969). Local response and reinforcement 
rates are calculated in terms of the time spent in each 
component, and the evidence cited by Rachlin (1973) 
in support of his theory shows time matching in this 
sense (Catania, 1966; Killeen, 1972b; Shull & Pliskoff, 
1967; Silberberg & Fantino, 1970). 

But Herrnstein (1970, 1971) reported that pigeons 
matched relative response rates, but not the relative 
cumulated interchangeover time, to the relative reinforcement
frequency in a two-key concurrent VI VR
schedule. In this situation, rate of reinforcement is 
essentially independent of rate of responding on the 
VI but is directly proportional to response rate on 
the VR.⁴ Since local rates of pecking on VR schedules
are faster than those on VI schedules for the same fre- 
quency of reinforcement, matching cannot hold for 
both relative rate of responding and relative cumu- 
lated interchangeover time. For each of the 10 pigeons 
studied by Herrnstein, matching described relative 
rate of responding over a wide range of different VI 
and VR values, although local response rates were 
faster on the VR. 
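
The incompatibility can be stated algebraically. In the sketch below, $T_1$ and $T_2$ denote the cumulated times in the two components and $b_1$ and $b_2$ the local response rates there. These symbols are introduced here for illustration and are not the chapter's own notation.

```latex
\begin{align*}
R_1 = b_1 T_1, \quad R_2 = b_2 T_2
  \;&\Longrightarrow\; \frac{R_1}{R_2} = \frac{b_1}{b_2} \cdot \frac{T_1}{T_2} \\
  \;&\Longrightarrow\; \log\frac{R_1}{R_2} = \log\frac{T_1}{T_2} + \log\frac{b_1}{b_2}.
\end{align*}
```

When the local rates differ ($b_1 \neq b_2$), the response ratio and the time ratio differ by the factor $b_1/b_2$, so at most one of them can equal the reinforcement ratio $r_1/r_2$. If that factor stayed constant across conditions, the two functions would appear in logarithmic coordinates as parallel lines separated by $\log(b_1/b_2)$.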

Herrnstein and Loveland (unpublished data) recently
obtained similar results in a CO-key concurrent
VI VR procedure. The data are analyzed here in
terms of Equations 4 and 5 (proportional ratio matching).
All four pigeons in the study matched response
ratios to obtained reinforcement ratios, but with a
proportional bias toward the VR schedule; i.e., the
pigeons responded a constant proportional amount
more on the VR at all reinforcement ratios (except
0 or ∞). Figure 5 plots the logarithm of the response
and time ratios against the logarithm of reinforcement
ratios for each pigeon. The regression lines of
best fit to these data and percentage of the variance
accounted for are also shown. One pigeon matched
both time and responses, but for three other birds the
regression lines for time ratios are much flatter than
those for response ratios. However, since the range of
reinforcement ratios studied was narrow, the group
data should also be examined, i.e., all the data points
provided by the four pigeons. The least-squares regression
lines for these data are 1.03X - 0.15 for response
ratios and 1.01X + 0.09 for time ratios. Thus while
the individual pigeons showed better response matching,
the slopes for the group data functions were close
to 1.0 for both responses and time.

La Bounty and Reynolds (1973) studied pigeons in 
a two-key concurrent FI FR schedule. This schedule 
has similiar properties to the VI VR schedule; local 
response rate is higher on the FR than the FI, and 
reinforcement rate on the FR is directly proportional 
to response rate. The experimenters report approx- 
imate matching between relative response rate and 
relative reinforcement frequency for four of six 
pigeons, but state that no pigeons matched relative 


4 It should be noted that relative reinforcement frequency in 
a concurrent VI VR schedule is not strictly an independent 
variable since reinforcement rate on the VR schedule is de- 
pendent on response rate on the VR. But this is often the case 
with short VI schedules as well, and the matching relation holds 
between relative reinforcement rate and obtained, not scheduled, 
relative reinforcement frequency. 
Fig. 5. The relation between log response ratio and log rein- 
forcement ratio, and between log time ratio and log reinforce- 
ment ratio, for four pigeons responding on CO-key concurrent 
VI VR schedules. The heavy diagonals represent unbiased 
matching; the fine diagonals represent the regression lines fitted 
to the data by the method of least-squares. The percentage of 
the data variance accounted for by each regression line is also 
given. (Herrnstein & Loveland, unpublished data.) 


time to relative reinforcement. They conclude that 
time matching is not more basic than response match- 
ing, since the latter can hold without the former. 
However, a different picture emerges when their data 
are reanalyzed in terms of proportional ratio match- 
ing, which accounts for bias toward one schedule. 
Time ratios in fact fit proportional ratio matching 
better than response ratios for four of the six pigeons. 
The slopes of the regression lines relating time 
ratios and reinforcement ratios for five of the pigeons 
were .88, .88, .99, .81, and .66; for response ratios they 
were .84, .79, .89, .75, and .86 for the same pigeons. All 
the pigeons showed a bias in time ratios toward the 
FI schedule, but the same bias was not found for re- 
sponse ratios. The sixth pigeon was either indifferent 
between the two schedules or else showed exclusive 
preference for the FR key. Thus in this study response 
matching was no better than time matching; both 
response and time distributions tended to somewhat 
undermatch the reinforcement distribution. A possible 
explanation of the undermatching found in concur- 
rent FI FR schedules will be considered later in the 
chapter (under the heading “Applications of Herrn- 
stein’s Equations to Other Schedules”). 

To summarize, both response and time matching 
are found in concurrent schedules, and the best gen- 
eral conclusion at present is that the distribution of 
reinforcement in a concurrent schedule governs the 
distribution of behavior. Sometimes behavior is best 
measured in terms of the time allocated to each sched- 
ule, sometimes by the rate of responding. In some 
situations only one measure is appropriate (e.g., Baum 
& Rachlin, 1969; Brownstein & Pliskoff, 1968; Nevin, 
1969), but why one measure sometimes works better 
than the other when both are available (Hollard & 
Davison, 1971) remains to be determined. 


THE GENERALITY OF THE MATCHING 
RELATION 


As Catania (1966) indicated in the first substantial 
review of choice in concurrent schedules, the signif- 
icance of the matching relation depends on the range 
of conditions over which it occurs. We have already 
considered a substantial amount of evidence for 
matching to relative frequency of reinforcement in 
concurrent VI schedules, but does the matching rela- 
tion hold for other reinforcement parameters (e.g., 
magnitude, immediacy, or quality of reinforcement); 
for different schedules besides the concurrent VI VI; 
or for negative reinforcement? 


Magnitude of Reinforcement 


The data on reinforcement magnitude are equiv- 
ocal, both matching and undermatching being re- 
ported in studies of concurrent schedules. Catania 
(1963a) found that two of his three pigeons matched 
their distribution of responses to the relative dura- 
tions of grain reinforcement provided by two equal 
VI schedules. But he examined only one relative rein- 
forcement duration besides equality. Brownstein 
(1971) investigated choice between two pairs of un- 
equal reinforcement durations in addition to equality 
in the time allocation procedure used by Brownstein 
and Pliskoff (1968). Three pigeons chose between two 
equal VI schedules of response-independent grain pres- 
entation by pecking at a CO key. Different COD 
durations (2, 5, and 7 sec) were scheduled for each 
bird. The slopes of the regression lines relating time 
ratios to reinforcement duration ratios were .70, 1.20, 
and .95 for the three pigeons. The pigeon for which 
the shortest COD was programmed (2 sec) was under- 
matching, and the pigeon with the 5-sec COD was 
overmatching. The regression line for the group data 
was .95X + .02. De Villiers and Millenson (1972) also 
reported response matching to relative duration of 
condensed milk reinforcement for three rats, but they 
only examined one relative duration (.75). 

In a two-key concurrent schedule, Fantino, Squires, 
Delbruck, and Peterson (1972) varied the overall fre- 
quency of grain reinforcements while keeping the rela- 
tive duration constant at .80 (1.5 and 6 sec). Relative 
response rates and the relative total reinforcer time on 
the 6-sec key increased as reinforcement became more 
frequent, because the pigeons spent more time re- 
sponding on that key. Fantino et al. analyzed their 
data in terms of relative response rates and reported 
that the pigeons failed to match either relative hopper 
duration or relative total reinforcer time (hopper 
duration times the number of reinforcements on that 
key). But if the data are reanalyzed in terms of pro- 
portional ratio matching (Equation 5; repeated here 
for the reader’s convenience), 


$$\frac{R_1}{R_2} = k\left(\frac{r_1}{r_2}\right)^{a} \tag{5}$$


which allows for bias toward one key, then matching 
is found (Baum, 1974a). A regression line of slope 1.08 
and a k of 2.4 accurately describe the ratio data. The 
pigeons of Fantino et al. were therefore matching 
total reinforcer time, but with a constant proportional 
bias toward the 1.5-sec key. Such a systematic bias at 
all frequencies of reinforcement could arise if the time 
spent eating on the 6-sec reinforcement was in fact less 
than 4 times the 1.5-sec reinforcement. 
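
The size of the discrepancy in eating time that would be needed to produce a bias of this magnitude can be estimated with simple arithmetic. The sketch below is illustrative only and rests on two assumptions that are not stated in the original report: that the slope is exactly 1.0, and that the whole of the bias k reflects the difference between nominal hopper duration and the time actually spent eating. The function name is hypothetical.

```python
def implied_effective_duration_ratio(nominal_long, nominal_short, k):
    """Effective long/short eating-time ratio implied by a bias k.

    Assumes unbiased matching (slope 1.0) to the time actually spent
    eating, with a measured bias k toward the short-duration key when
    nominal hopper durations are used instead.
    """
    nominal_ratio = nominal_long / nominal_short
    return nominal_ratio / k

# Nominal durations of 6 sec and 1.5 sec, and the reported k of 2.4.
ratio = implied_effective_duration_ratio(6.0, 1.5, 2.4)
```

Under these assumptions the 6-sec reinforcement would have to yield only about 1.7 times as much eating time as the 1.5-sec reinforcement, rather than the nominal 4 times.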

The above results support response or time match- 
ing to reinforcement duration, but several other 
studies have failed to find matching to either relative 
duration or total reinforcer time. Walker, Schnelle, 
and Hurwitz (1970), using rats as subjects, varied the 
duration of access to sucrose solution and reported 
only poor matching between relative response rates 
and relative reinforcer durations. In a subsequent 
experiment, Walker and Hurwitz (1971) also varied 
the access to sucrose, but scheduled reinforcements 
by a single VI schedule (Stubbs & Pliskoff, 1969), there- 
by insuring that reinforcement frequency did not co- 
vary with reinforcement duration. They, too, found 
that preference for an alternative was less extreme 
than would be predicted by matching to the relative 
duration of reinforcement. However, in both these 
studies the COD was only 2 or 3 sec, too short for 
matching in concurrent schedules with rats. Moreover, 
the COD was programmed from the last response on a 
lever before a CO (Findley, 1958), making it func- 
tionally even shorter than the usual COD (Catania, 
1966). Shull and Pliskoff (1967) and de Villiers and 
Millenson (1972) found that CODs longer than 5 sec 
were necessary before rats matched either frequency 
or duration of reinforcement. 

Todorov (1973) studied concurrent schedules in 
which frequency and duration of reinforcement co- 
varied. The pigeons in his experiment considerably 
undermatched the relative total reinforcer time, and 
frequency of reinforcement affected choice more than 
did duration. But this experiment used an extremely 
complex procedure. Three different pairs of VI sched- 
ules were programmed on the main key in randomized 
combinations, each pair in operation for 20 of the 60 
reinforcements per session. The durations of rein- 
forcement for the three different VI schedules were 
then varied in blocks of sessions. The procedure was 
therefore a multiple concurrent schedule, and it is 
difficult to assess the reinforcement interactions that 
such a schedule might produce. (Interactions in 
multiple schedules are discussed later in the chapter). 
Furthermore, Todorov used a 1-sec time-out con- 
tingent on a CO response in place of a COD. Whether 
the time-out is functionally equivalent to a COD 
(see p. 244) or whether it separated the schedules 
sufficiently is uncertain. The flat function relating 
relative response rate to relative total reinforcer time 
observed by Todorov is typical of the function found 
in the absence of a COD, so it is possible that the 
pigeons never fully distinguished the three different 
component schedules. However, Todorov did find that 
frequency of reinforcement had a greater effect on 
relative response rate than did reinforcement dura- 
tion. This could represent either a problem of dis- 
crimination or a fundamental difference in the rela- 
tionship between response strength and various 
reinforcement parameters. Mariner and Thomas (1969) 
argued that pigeons have difficulty discriminating be- 
tween different hopper durations because they cannot 
tell that they are in the longer duration until the time 
of the shorter duration has passed. Mariner and 
Thomas failed to obtain peak shift⁵ (Hanson, 1959) 

5 The peak shift is a displacement of the peak of a post- 
discrimination generalization gradient away from the stimulus 
associated with the higher frequency of reinforcement (S+) in 
the direction away from the stimulus associated with extinction 
or the lower frequency of reinforcement (S-). 
from pigeons in a wavelength discrimination with dif- 
ferent hopper durations unless the two durations were 
signaled by hopper lights of different intensity. The 
peak shift is readily obtained when reinforcement fre- 
quency varies in a discrimination procedure (Dysart, 
Marx, McLean, & Nelson, 1974; Guttman, 1959). Thus 
pigeons might match relative reinforcement durations 
more closely if differential signals were provided for 
the hopper durations. 

Schneider (1973) avoided a difficulty common to all 
these studies—that of determining whether the actual 
relative quantity of reinforcer obtained by the subjects 
equals that scheduled by the timers for hopper dura- 
tion. He varied the number of food pellets provided 
during each reinforcement in a two-key concurrent 
schedule. A single VI schedule arranged the rein- 
forcements, which were assigned with an equal prob- 
ability to each key. A 1.5-sec COD operated through- 
out the experiment. Schneider noted that a fixed 
quantity of reinforcement on a key maintained more 
rapid responding when it was delivered frequently in 
small amounts than when it was delivered infre- 
quently in large amounts. When reinforcement 
frequency remained constant the pigeons under- 
matched the relative quantity of reinforcement at 
three different pellet ratios (1:1, 1:3, and 1:7). Slopes 
of the regression lines relating log response ratios to 
log pellet ratios were .19, .41, .43, and .58 for the four 
pigeons. However, when the number of pellets per 
reinforcement remained constant, undermatching was 
also obtained for the same ratios of reinforcement 
frequency (slopes of .35, .46, .64, and .63). In view of 
the excellent matching reported by Stubbs and 
Pliskoff (1969) for frequency using the same sched- 
uling procedure and a 2-sec COD, Schneider’s result 
is surprising. The 1.5-sec COD was possibly too short 
for his pigeons, and it is unfortunate that he did not 
also study choice at longer COD durations. 

Unpublished data from an experiment by de Vil- 
liers and Balboni suggest another reason why Schnei- 
der did not find matching to relative quantity of food. 
De Villiers and Balboni studied two rats responding 
on a two-lever concurrent VI schedule for different 
numbers of food pellets. A single VI 30-sec schedule 
arranged reinforcements which were assigned with 
equal probability (.50) to each lever. With one pellet 
of food for responding on the left lever and five pellets 
on the right, the COD was systematically increased 
from 5.5 through 15 sec. Relative response rate on the 
five-pellet lever increased between 5.5-sec and 7.5-sec 
COD duration but declined with further lengthening 
of the COD. Changeover rate decreased with increas- 
ing COD length. As the lower panel of Figure 6 indi- 
Fig. 6. (Top:) The mean log response ratios (left panel) and 
mean log time ratios (right panel) of two rats responding on a 
single-tape concurrent VI schedule plotted as a function of log 
pellet ratio N1/N2. The open symbols represent data from the 
condition in which relative frequency of reinforcement was kept 
constant at .50 and the relative number of food pellets for each 
alternative varied: the filled symbols represent the condition 
in which number of pellets per reinforcement was the same 
for both levers and relative frequency of reinforcement varied. 
The least-squares regression lines and the percentage of the 
data variance that they account for are given in the upper 
lefthand corner of each panel for frequency and the lower 
right-hand corner for number of pellets. (Bottom:) The bar 
graphs show the deviation from matching to relative pellet 
number (.83) (relative response rate minus relative number of 
pellets) for each rat as the COD duration varied. (From de 
Villiers & Balboni, unpublished data.) 


cates, the smallest deviation from matching to the 
relative pellet number of .83 occurred at the 7.5-sec 
COD, and matching to relative pellet number was not 
observed at any COD value. 

In the second condition of the experiment the 
COD was set at 7.5 sec and the number of food pellets 
for responding on each lever varied through the fol- 
lowing sequence: 1:5, 3:3, 4:2, and 5:1. The VI 30-sec 
schedule continued to assign reinforcements with an 
equal probability to the two levers. The open symbols 
in the upper panels of Figure 6 show the mean log 
response ratios and mean log time ratios (cumulated 
interchangeover time) of the two rats plotted against 
the mean log pellet ratios. The least-squares regression 
lines fitted to the data were .45X + .03 (responses) 
and .53X — .26 (time). Slopes of the individual func- 
tions were .52 and .38 for responses, .62 and .46 for 
time ratios. 

The upper panels of Figure 6 also give the mean 
response and time ratio data for the third condition of 
the experiment. With the same COD duration and 


three pellets for responding on each lever, the prob- 
ability of reinforcement assignment to each lever was 
varied through the sequence: .50:.50, .15:.85, and 
.75:.25. Regression lines fitted to the mean data from 
the two rats were .89X — .01 (responses) and .94X — 
.09 (time). The slopes of the individual functions were 
.91 and .86 for responses, and .77 and 1.11 for time 
ratios. One rat therefore matched responses and the 
other rat approximately matched time allocation to 
the ratios of the obtained frequencies of reinforce- 
ment. 

The results suggest that matching to relative mag- 
nitude of reinforcement does not hold in the Stubbs 
and Pliskoff (1969) single-tape scheduling procedure 
regardless of COD length and changeover rate. The 
single-tape procedure forces the animal to respond for 
some time on the alternative with the smaller rein- 
forcer since half the reinforcement opportunities are 
scheduled for that side. Reinforcements assigned to 
that alternative must be obtained before the VI tape 
starts timing intervals again for either alternative. In- 
creasing the COD reduces changeovers and separates 
the two alternatives in time, but it forces the animal 
to spend even longer responding on the alternative 
with the small reinforcer in order to restart the tape. 
Since the Stubbs and Pliskoff procedure, which varies 
relative frequency of reinforcement, does not force the 
animal toward the less frequently reinforced alterna- 
tive to the same degree, matching is found at suitably 
long COD durations. 

A recent experiment on magnitude of reinforce- 
ment by Iglauer and Woods (1974) used indepen- 
dently programmed VI 1-min schedules, thus avoiding 
the constraints on responding inherent in the single- 
tape procedure. They varied the dosage of intravenous 
cocaine injections available on each schedule, a 
parameter of reinforcement magnitude more imme- 
diately discriminable than duration. Drug dosage was 
manipulated by varying the volume of a constant- 
concentration cocaine solution injected over a con- 
stant time period. Reinforcers therefore differed in 
volume but not in concentration or duration. The full 
procedure was a modification of the standard two- 
lever concurrent schedule in that a single response on 
a center lever initiated the concurrent schedule on 
two adjacent levers. Responding on one lever pro- 
duced a constant dose of .1 mg/kg/injection, while the 
dosage of cocaine associated with the other lever 
varied from .025 to .4 mg/kg/injection. During rein- 
forcement, one of two pumps injected the cocaine 
solution for 35 sec, followed by a 5-min blackout of 
the chamber for the drug to take effect. At the end of 
the blackout a single response on the center lever 
again initiated the choice procedure. A 1.5-sec COD 
was in effect during the concurrent schedule com- 
ponent. 

Since relative reinforcement frequency covaries to 
some extent with relative response rate in the two- 
tape procedure, Iglauer and Woods calculated the 
relative drug intake on the two levers. Drug intake 
represents the number of reinforcements received on a 
lever multiplied by the drug dose available on it. For 
both monkeys on the concurrent schedule, relative 
response rates matched relative drug intake over a 
wide range of values. Regression lines relating log 
response ratio to log drug intake ratio (excluding 
points of exclusive preference) were 1.08X — .06 and 
1.11X + .01 for the two monkeys. 
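
The drug intake measure described above is the product of the number of injections received on a lever and the dose per injection. A minimal sketch of the calculation follows; the injection counts are hypothetical and are not taken from Iglauer and Woods, and the function name is an assumption made for illustration.

```python
def relative_drug_intake(n_a, dose_a, n_b, dose_b):
    """Relative drug intake on lever A.

    Intake on each lever is the number of reinforcements received
    multiplied by the dose (mg/kg/injection) available on that lever.
    """
    intake_a = n_a * dose_a
    intake_b = n_b * dose_b
    return intake_a / (intake_a + intake_b)

# Hypothetical session: 30 injections at the constant .1 dose and
# 10 injections at a comparison dose of .4 (both in mg/kg/injection).
rel_intake_constant_lever = relative_drug_intake(30, 0.1, 10, 0.4)
```

In this invented case the constant-dose lever delivers three times as many injections but accounts for only 3/7 of the total intake, which illustrates why relative intake rather than relative frequency is the appropriate independent variable.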

Two other monkeys responded in a concurrent 
chain schedule procedure (Autor, 1969; Herrnstein, 
1964), in which responding during an initial two- 
lever concurrent VI VI schedule (choice link) led to 
one of two equal-valued single-lever FR schedules for 
cocaine (terminal links). These monkeys matched rela- 
tive response rates during the choice link to their 
relative drug intake in the terminal links. 

On the whole, these studies suggest that frequency of 
reinforcement may have a greater effect than magni- 
tude on choice in concurrent schedules with short 
CODs, but the matching relation applies to total rein- 
forcement received (amount times frequency) when 
appropriate concurrent schedule procedures are em- 
ployed. 


Immediacy of Reinforcement 


The application of the matching relation was ex- 
tended to immediacy of reinforcement (1/delay) by 
Chung and Herrnstein (1967). For four pigeons, VI 
reinforcement on one key (the standard key) was de- 
layed for 8 sec, while on the other key (the experi- 
mental key) the delay of reinforcement on an equal VI 
schedule varied from 1 to 30 sec. For two other 
pigeons the standard key delay was 16 sec. During the 
delay of reinforcement the experimental chamber was 
blacked out. The slopes of the least-squares regression 
lines relating ratios of responses to ratios of imme- 
diacy of reinforcement on the two keys were .92 for 
the 8-sec standard delay group and 1.05 for the 16-sec 
group. The 16-sec pigeons showed a consistent bias 
toward the experimental key, but the slopes of both 
functions were close to perfect matching. In an earlier 
experiment, Chung (1965) had studied choice between 
immediate and delayed reinforcement in equal con- 
current VI schedules. Chung and Herrnstein (1967) 
demonstrated that if a small constant (1.6 sec) was 
taken as the actual delay interval for what was nom- 
inally immediate reinforcement, Chung’s pigeons were 
actually matching the relative reciprocal of delay 
on each key. The constant represents the time taken 
for the pigeon to lower its head to the feeder and 
begin eating. 
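
Relative immediacy, as used here, is the reciprocal of the delay on one key divided by the sum of the reciprocals for both keys. The sketch below illustrates the calculation, including the substitution of a small constant for a nominally immediate (zero) delay. The function and parameter names are hypothetical, and the delay values in the examples are chosen only for illustration.

```python
def relative_immediacy(delay_a, delay_b, immediate_delay=1.6):
    """Relative immediacy of reinforcement (1/delay) on key A.

    A nominal delay of 0 sec is replaced by immediate_delay, the
    constant taken as the actual delay for "immediate" reinforcement.
    """
    d_a = delay_a if delay_a > 0 else immediate_delay
    d_b = delay_b if delay_b > 0 else immediate_delay
    imm_a = 1.0 / d_a
    imm_b = 1.0 / d_b
    return imm_a / (imm_a + imm_b)

# A 2-sec delay on one key against the 8-sec standard delay.
two_vs_eight = relative_immediacy(2.0, 8.0)

# Nominally immediate reinforcement against an 8-sec delay,
# using the 1.6-sec constant for the immediate key.
immediate_vs_eight = relative_immediacy(0.0, 8.0)
```

Without the constant, a delay of zero would give an infinite immediacy and predict exclusive preference; with it, the predicted relative response rate for the immediate key in the second example is about .83.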

On the other hand, Shimp (1969b, Experiment II) 
found that if pigeons in a similar choice procedure 
had to peck after the delay blackout to obtain rein- 
forcement, two of the three birds still matched relative 
response rates to the relative reciprocal of blackout 
duration. Since there was now no time interval im- 
posed between the last peck and food presentation, 
Shimp argued that the blackout and not the delay of 
reinforcement was crucial in Chung and Herrnstein’s 
experiment. Shimp’s procedure is much like requiring 
a lengthy prereinforcement interresponse time (IRT) 
stipulated by the blackout duration. Shimp (1969b, 
Experiment I) and Moffitt and Shimp (1971) demon- 
strated that in concurrent VI schedules in which 
different IRTs are required for reinforcement on the 
two schedules, pigeons match response rates to the 
relative reciprocal of IRT duration. 

But experiments by Herbert (1970) do not support 
Shimp’s interpretation of Chung and Herrnstein’s re- 
sults. Herbert used a single VI schedule and assigned 
the reinforcements with an equal probability to the 
two keys. In the first experiment, reinforcement for 
each key was delayed by a blackout, as in Chung and 
Herrnstein’s study. Two relative delay values besides 
equality were examined. The slopes of regression lines 
relating ratios of responses to ratios of immediacy of 
reinforcement were .86, .73, and .86 for the three 
pigeons. The short 1-sec COD or the single-tape pro- 
cedure (see p. 249) could account for the under- 
matching observed, especially for one bird. In a 
second experiment with the same pigeons, Herbert 
repeated Shimp’s procedure requiring a response be- 
fore food presentation after the blackouts. Only one 
of the three pigeons continued to approximate match- 
ing to the relative reciprocal of blackout duration; the 
other two were much closer to indifference at three 
different relative blackout durations. Matching was 
therefore considerably impaired by the added response 
requirement in this study. 

In Herbert’s third experiment reinforcement on 
one key was immediate, whereas it followed a black- 
out delay on the other key (as in Chung’s 1965 pro- 
cedure). But an equal number and duration of re- 
sponse-contingent blackouts were also programmed at 
variable intervals on the immediate key, indepen- 
dently of the reinforcement. Under these conditions, 
no pigeon matched the relative immediacy of rein- 
forcement when a small constant was taken as the 
actual delay on the immediate key. When the relative 
frequency of reinforcement was fixed at .50, relative 
response rate on the key with immediate reinforce- 
ment increased as a linear function of increasing de- 
lay on the other key as opposed to the exponential 
function obtained by Chung (1965). On the basis of 
these results, Herbert questions Chung and Herrn- 
stein’s interpretation of their findings in terms of 
matching to relative immediacy of reinforcement. 
However, the pigeons in Herbert’s third experiment 
did not choose between immediate and delayed rein- 
forcement as was suggested; they actually chose be- 
tween immediate reinforcement plus punishment by 
response-contingent time-out from food (blackout) 
and delayed reinforcement. The relation between this 
procedure and choice between two different delays 
of reinforcement (Chung & Herrnstein, 1967; Herbert, 
1970, Experiment I), for which approximate match- 
ing is found, is unclear. The effects of punishment on 
choice in concurrent schedules are considered later in 
the chapter. 

Nevertheless, the possibility remains that the 
pigeon’s choice in Chung and Herrnstein’s experiment 
was influenced more by rate of reinforcement than by 
the delay per se. Their procedure is formally similar 
to a concurrent chain schedule in which responding 
in an initial concurrent VI VI schedule leads to un- 
equal-length terminal links (blackouts) ending with 
noncontingent food, i.e., different fixed-time (FT) 
schedules. The shorter terminal link (i.e., the shorter 
delay period) has the higher rate of reinforcement, 
and the relative frequency of reinforcement equals the 
relative reciprocal of blackout duration. Neuringer 
(1969) demonstrated that pigeons are indifferent be- 
tween FT and FI terminal links in a concurrent chain 
schedule. This suggests that relative response rates in 
the choice link of a concurrent chain schedule should 
show the same functional relation to relative terminal 
link reinforcement frequency whether the terminal 
links are FT schedules (as in Chung and Herrnstein’s 
study) or FI schedules. But Duncan and Fantino 
(1970) found greater preference for the shorter of two 
FI terminal links than was predicted by matching to 
the obtained relative frequency of reinforcement, al- 
though they investigated a range of FI durations 
similar to the delay duration employed by Chung and 
Herrnstein. Neuringer (1969) himself obtained under- 
matching between relative response rates in the choice 
link and the relative reciprocal of differing FI and FT 
terminal link durations. Thus the role of rate of rein- 
forcement as opposed to delay in Chung and Herrn- 
stein’s results remains uncertain. (See Fantino, chapter 
11 in this volume.) 


Qualitatively Different Reinforcers 


In all the above experiments the physical dimen- 
sions of a given reinforcer were varied. But what if 
the reinforcers differ in quality? In this case the sub- 
ject may prefer one of the reinforcers over the other, 
even at equal frequencies of reinforcement. Hollard 
and Davison (1971) studied pigeons under two-key 
concurrent VI schedules with food as the reinforcer 
on one key and ectostriatal brain stimulation as the 
reinforcer on the other. A single VI schedule assigned 
reinforcements according to different proportions to 
the two keys (Stubbs & Pliskoff, 1969). The brain 
stimulation parameters were kept constant while the 
frequency of food reinforcement was varied. Although 
all the pigeons showed a constant proportional prefer- 
ence for the food at all frequencies of reinforcement, 
proportional ratio matching accurately described the 
relation between ratios of time spent responding for 
food or brain stimulation and ratios of the number of 
reinforcements of each kind. Individual regression 
lines relating the log of the time ratios to the log of 
the reinforcement ratios were: Y = 1.05X + .74; 
Y = 1.01X + .27; and Y = .98X + .78. On the other 
hand, all three pigeons produced regression lines re- 
lating response ratios to reinforcement ratios with a 
flatter slope than that predicted by matching, the 
individual slopes being .79, .65, and .83. Biased time 
matching but not response matching was therefore 
found for the different qualities of reinforcement. 

Brown and Herrnstein (1975) have pointed out that 
the matching relation might not hold where the 
different reinforcers interact, as food and water do 
(Bolles, 1967). Since animals tend to drink following 
food consumption (Bolles, 1967), increasing the fre- 
quency of food reinforcement might enhance the 
value of a constant frequency of water reinforcement. 
On the other hand, Wood, Martinez, and Willis 
(1975) found that increasing the FR requirement in a 
concurrent FI (food) FR (food or water) schedule had 
different effects depending on whether the FR rein- 
forcement was food or water. When both reinforcers 
were food, increasing the FR also increased the 
animal’s responding on the FI; but when the FR 
reinforcer was water, responding on the FI for food 
was unaffected by changes in the FR. Systematic varia- 
tion of the frequency of food and water reinforcers in 
concurrent schedules is necessary to determine the 
form of the choice function and the nature of any in- 
teraction between the reinforcers. 


Punishment and Choice in Concurrent Schedules 


Few studies have quantified the effects of punish- 
ment on choice in concurrent schedules. Holz (1968) 
punished concurrent responses maintained by differ- 
ent frequencies of reinforcement in a two-key concur- 
rent VI 1-min VI 4-min schedule. Each response on 
the two keys was punished by a brief electric shock. 
As the shock intensity increased, response rate on both 
keys progressively decreased. Nevertheless, at all shock 
intensities, as long as the pigeons continued to re- 
spond, the proportion of responses to each key 
matched the proportion of reinforcements obtained 
from that key. In Holz’s procedure every response 
was punished, so the proportion of punishments 
equaled both the proportion of responses and the 
proportion of reinforcements on the two keys as long 
as the pigeon matched. If the shocks had been pro- 
grammed so that this proportionality no longer held— 
e.g., by equal VI or FI punishment schedules—some 
interaction between the positive reinforcement and 
the punishment might be expected, and the simple 
matching relation might not hold. Azrin and Holz 
(1966) reported that when only one of the concurrent 
responses was punished, the alternative response 
increased in frequency while the punished response 
was rapidly suppressed. However, no systematic study 
has yet quantified the interactions between punishing 
and rewarding consequences of responding in a con- 
current schedule paradigm (cf. Rachlin & Herrnstein, 
1969). 

As discussed earlier, Todorov (1971) punished the 
CO response in a CO-key concurrent VI schedule with 
electric shock or time-out from reinforcement. Rela- 
tive response rate and relative time allocation for the 
preferred schedule increased sharply as the shock in- 
tensity was increased or the time-out duration length- 
ened, but the relative reinforcement frequency did 
not also increase. The relative response and time 
measures therefore deviated more and more from 
matching in the direction of the preferred schedule 
as the punishment increased. This deviation from 
matching can be explained by the differences between 
the Holz (1968) and Todorov procedures. In the 
simple continuous punishment procedures used by 
Holz, each choice is punished in proportion to the 
number of responses made to it. However, in 
Todorov’s procedure, each choice receives the same 
number of punishments per session, since the COs in 
both directions must be equal. Thus the less preferred 
choice receives disproportionately more punishment 
than the more preferred and is more suppressed. 
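
The contrast between the two procedures can be made concrete by computing the number of punishments per response on each alternative. The sketch below uses invented session totals, not data from Holz or Todorov, and assumes for illustration that each punished changeover is charged to one of the two choices in equal numbers, as the argument above supposes.

```python
def punishments_per_response(punishments, responses):
    """Average number of punishments received per response."""
    return punishments / responses

# Hypothetical session totals (not data from either study).
responses_preferred = 3000
responses_nonpreferred = 1000

# Holz-type procedure: every response is punished, so punishment is
# proportional to responding on each key.
holz_preferred = punishments_per_response(
    responses_preferred, responses_preferred)
holz_nonpreferred = punishments_per_response(
    responses_nonpreferred, responses_nonpreferred)

# Todorov-type procedure: each choice receives the same number of
# punished changeovers per session, whatever the response totals.
changeovers_each_way = 100
todorov_preferred = punishments_per_response(
    changeovers_each_way, responses_preferred)
todorov_nonpreferred = punishments_per_response(
    changeovers_each_way, responses_nonpreferred)
```

With these values the punishment density is identical on the two keys under continuous punishment, but three times higher on the less preferred alternative when changeovers are punished.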

De Villiers and Millenson (1972) also studied the 
interaction between positive reinforcement and aver- 
sive stimulation in a choice procedure. They superim- 
posed a conditioned aversive stimulus (a conditioned 
suppression procedure—Estes and Skinner, 1941) on 
a two-lever concurrent VI schedule with different 
durations of reinforcement programmed for each re- 
sponse. They suggested that the aversiveness of the 
conditioned suppression procedure summated with 
the reinforcement for the two responses, subtracting a 
constant value from each. If this were the case, the 
relative reinforcing value of the preferred schedule 
would increase, 


$$\frac{r_1 - c}{(r_1 - c) + (r_2 - c)} > \frac{r_1}{r_1 + r_2} \qquad (7)$$


where $r_1 > r_2$ and $c > 0$, and so should the relative
performance measures if matching to relative value 
were retained. In fact, de Villiers and Millenson 
found increased preference for the lever associated 
with the bigger reinforcement during the preshock 
stimulus. A similar interaction between the food rein- 
forcement and the aversive value of the punishment 
could account for the increased preference for the 
more favored key in the Todorov (1971) study.
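
A minimal numerical sketch of this subtractive account follows. The function name relative_value and the reinforcement values are illustrative assumptions; c stands for the constant aversive value subtracted from each alternative, and the code assumes that c is smaller than the lesser of the two reinforcement values.

```python
def relative_value(r1, r2, c=0.0):
    """Relative value of alternative 1 when a constant c is subtracted
    from the reinforcing value of each alternative."""
    if c >= min(r1, r2):
        raise ValueError("c must be smaller than both reinforcement values")
    return (r1 - c) / ((r1 - c) + (r2 - c))


# Hypothetical values: alternative 1 is worth twice alternative 2.
baseline = relative_value(6.0, 3.0)              # no aversive stimulus
with_aversion = relative_value(6.0, 3.0, c=1.0)  # constant subtracted

print(baseline, with_aversion)
```

Under these assumptions the relative value of the richer alternative rises from 6/9 to 5/7, which is the direction of change reported for the preshock stimulus.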

This interpretation of Todorov’s results receives 
support from a recent experiment by de Villiers (unpublished
data). Three pigeons responding on a two-key
concurrent VI 1-min VI 3-min schedule for grain
received intermittent punishment with brief electric
shock for pecks at each key. Punishments were arranged
by a single VI 15-sec schedule and assigned
with an equal probability (.50) to each key, thus main- 
taining the relative frequency of punishment at .50. A 
3-sec COD was programmed for food but not for 
punishment throughout the experiment. In the ab- 
sence of punishment the pigeons closely matched rela- 
tive response rates and relative time allocation 
(cumulated interchangeover time) to the relative fre- 
quency of reinforcement of .75. But with increasing 
intensity of punishment, the relative response and 
time distributions deviated more and more from 
matching toward the key with more frequent rein- 
forcement. Overall response rate on the VI 3-min key 
was much more suppressed by punishment than that 
on the VI 1-min key. Figure 7 depicts the results from
each pigeon. 
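
The two measures plotted in Figure 7 can be sketched as follows. The response rates are hypothetical values chosen for illustration, and the relative reinforcement frequency is computed from the programmed VI values, which is an assumption; the text reports obtained values.

```python
def relative_frequency(a, b):
    """Proportion a / (a + b)."""
    return a / (a + b)


def suppression_ratio(punished_rate, baseline_rate):
    """B / (A + B): .5 means no suppression, 0 means complete suppression."""
    return punished_rate / (baseline_rate + punished_rate)


# Programmed reinforcements per hr on VI 1-min and VI 3-min.
rel_reinforcement = relative_frequency(60.0, 20.0)

# Hypothetical response rates (responses per min) on the two keys.
base_rich, base_lean = 60.0, 20.0   # no punishment
pun_rich, pun_lean = 45.0, 5.0      # with punishment

# Deviation from matching: relative responding minus relative reinforcement.
deviation = relative_frequency(pun_rich, pun_lean) - rel_reinforcement

ratio_rich = suppression_ratio(pun_rich, base_rich)
ratio_lean = suppression_ratio(pun_lean, base_lean)
print(rel_reinforcement, deviation, ratio_rich, ratio_lean)
```

A positive deviation indicates a shift toward the key with more frequent reinforcement, and the smaller suppression ratio marks the more suppressed key.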

Fig. 7. Deviation of relative response rates (top panels) and
relative time distribution (center panels) of three pigeons from
relative frequency of reinforcement (relative frequency of responding
or relative time minus relative frequency of reinforcement)
on a concurrent VI 1-min VI 3-min schedule as a
function of punishment intensity. Relative frequency of punishment
was the same on each key. Shown in the lower panels
are the suppression ratios for each key, calculated in terms of
B/(A + B), where B is response rate on a key in a particular
punishment condition and A is response rate on the same key
in the baseline condition (no punishment). (From de Villiers,
unpublished data.)

Further systematic investigation of the effects of
different punishment or conditioned suppression
parameters on a wider range of relative reinforcement 
frequencies is needed to determine the function that 
relates the positive and negative consequences of 
choice. The sensitivity of concurrent performances to 
both positive reinforcement and punishment param- 
eters (Catania, 1966; Holz, 1968) indicates that this 
paradigm could be very useful for study of the inter- 
actions between positive and aversive control of 
behavior. 


Negative Reinforcement 


Baum (1973) used the same shuttlebox situation as 
Baum and Rachlin (1969) and four of the pigeons 
from that study in a concurrent VI escape procedure. 
Standing on either side of the chamber was reinforced
on two different VI schedules by a 2-min time-out
from a train of shocks. There were sizable hysteresis 
effects, but proportional ratio matching, Equation 4, 
provided a good fit to the data for two of the four 
pigeons; i.e., those pigeons matched the ratio of time 
spent on each side of the chamber to the ratio of the 
frequencies of time-out provided on that side. The 
data from the other two pigeons deviated from the 
matching relation in opposite directions. For one 
pigeon, a regression line of greater slope than 1.0 
fitted the time and reinforcement ratios, but this 
pigeon died after only the ascending series of schedule 
values. The data could therefore be affected by the
strong order effects found in this experiment. The 
second pigeon undermatched the ratios of time-out 
frequencies, and Baum suggests that this may be due 
to the short COD used in the study (1 sec). This bird 
showed a constant high rate of COs at all schedule 
values and also undermatched ratios of positive rein- 
forcement in Baum and Rachlin’s earlier study when 
the COD was as long as 4.25 sec. But the equivalence 
of COD values for positive and negative reinforce- 
ment in concurrent schedules remains to be demon- 
strated, so this explanation is tentative. 

Despite the difficulties involved in working with 
aversive contingencies, Baum’s results suggest that sub- 
sequent research on concurrent schedules of negative 
reinforcement could be extremely fruitful. The match- 
ing relationship may provide a means of integrating 
both positive and negative reinforcement into the 
same conceptual framework (see also pp. 260 and
270).


Different Schedules of Reinforcement 


Earlier in the chapter it was shown that the match- 
ing relation is readily extended to concurrent VI VR 
schedules (Herrnstein, 1970, 1971) and to concurrent 
FI FR schedules (La Bounty & Reynolds, 1973), 
though in the latter case the subjects tended to
undermatch. In concurrent ratio schedules, on the other
hand, matching cannot occur except trivially, since a 
ratio schedule fixes a proportionality between num- 
bers of responses and numbers of reinforcements. In 
fact, for most pairs of ratio values on concurrent VR 
VR or FR FR schedules the subjects maximize rein- 
forcements per response—i.e., respond exclusively to 
the alternative with the shortest ratio requirement 
(Herrnstein, 1964; Herrnstein & Loveland, 1975). 
The relation between maximizing and matching was 
discussed in an earlier section. 
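
The reason matching can occur only trivially on concurrent ratio schedules can be written out in a few lines. The symbols $n_1$ and $n_2$ (the number of responses required per reinforcement on each schedule) are assumptions introduced for this sketch; for VR schedules the first line holds on the average.

```latex
% n_1, n_2: ratio requirements, symbols assumed for this sketch.
\begin{align*}
r_1 &= \frac{R_1}{n_1}, \qquad r_2 = \frac{R_2}{n_2} \\[4pt]
\frac{R_1}{R_1 + R_2} &= \frac{R_1/n_1}{R_1/n_1 + R_2/n_2}
  \quad\Longrightarrow\quad
  R_2\left(\frac{1}{n_2} - \frac{1}{n_1}\right) = 0
  \qquad (R_1 > 0)
\end{align*}
```

With unequal ratio requirements, then, the matching equation is satisfied only when $R_2 = 0$ (or, by symmetry, $R_1 = 0$), that is, under exclusive preference for one alternative.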

What about choice between concurrent interval 
schedules other than the concurrent VI VI? Both
Nevin (1971) and Trevitt, Davison, and Williams
(1972) investigated choice in two-key concurrent VI FI 
schedules. Nevin reported that relative frequency of 
responding on the FI key depended on the relative 
frequency of reinforcement on that schedule, but did 
not match it. Instead, the ratio of responses on the FI 
to responses on the VI was a power function of the 
ratio of reinforcements for each schedule, with an 
exponent of approximately .50: 


$$\frac{R_1}{R_2} = \frac{r_1^{.5}}{r_2^{.5}} \qquad (8)$$


Two of his pigeons tended to favor the VI key while 
one favored the FI key. 

The power function is more general than the 
matching relation described by proportional ratio 
matching (Equation 5). While proportional ratio 
matching specifies that the exponent shown in Equa- 
tion 8 should be 1.0, the power function formulation 
allows the exponent to vary on both sides of 1.0. Thus 
it can account for both undermatching (slope <1.0) 
and overmatching (slope >1.0) to the reinforcement 
ratios. It should be noted that a suitable rescaling of 
the reinforcement variable—for example, by taking the 
square root of reinforcement frequency in the case of 
Equation 8—would give proportional ratio matching. 
The question of rescaling the reinforcement variable
to give proportional ratio matching is discussed in 
more detail on pp. 275 ff. 
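
A power function of this kind is usually estimated as a straight line in logarithmic coordinates. The sketch below is one way to do so, assuming a free multiplier (called bias here) in addition to the exponent; the function name fit_power_function and the data are hypothetical, with the data generated from an exponent of .5 and no bias.

```python
import math


def fit_power_function(reinf_ratios, resp_ratios):
    """Least-squares line through (log reinforcement ratio, log response ratio).

    Returns (exponent, bias) for resp_ratio = bias * reinf_ratio ** exponent.
    """
    xs = [math.log10(x) for x in reinf_ratios]
    ys = [math.log10(y) for y in resp_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, 10 ** intercept


# Hypothetical reinforcement ratios and response ratios with exponent .5.
reinf = [0.25, 0.5, 1.0, 2.0, 4.0]
resp = [x ** 0.5 for x in reinf]

exponent, bias = fit_power_function(reinf, resp)

# Rescaling the reinforcement variable by its square root restores a slope of 1.
rescaled_exponent, _ = fit_power_function([x ** 0.5 for x in reinf], resp)
print(exponent, bias, rescaled_exponent)
```

The second fit illustrates the rescaling argument: once the square root of the reinforcement ratio is taken as the independent variable, the same data conform to a slope of 1.0.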

Trevitt et al. (1972) also found a power function 
relation between the ratio of responses or time spent 
responding on each of the schedules and the ratio of 
reinforcements. All of their birds showed a consistent 
preference for the VI schedule. The exponents fitted 
to the Trevitt et al. data vary between .38 and .75 for 
response ratios, with an exponent of .62 for the aver- 
aged data. Similar slopes were found for each pigeon 
in control VI VI conditions, but the preference for 
the VI key was less marked. Comparing the VI FI and 
VI VI functions, Trevitt et al. therefore concluded 
that there was a constant proportional preference for 
the VI schedule in the VI FI choice. As discussed 
earlier, however, the VI VI data from this study are 
somewhat equivocal because of the long preexposure 
to the VI FI conditions during which the FI schedule 
was always associated with the same key; so the con- 
stant proportional preference for the VI may not be a 
general finding. 

Nevertheless, both studies indicate that matching 
of response or time ratios does not occur in concurrent
VI FI schedules. These ratios are related by a
power function in choice between VI and FI
schedules. 

A study of choice in CO-key concurrent FI FI 
schedules (White & Davison, 1973) indicates that the 
pattern of responding is crucial in determining 
whether or not the matching relation holds in concurrent
interval schedules. When typical VI performance
was found on both FI schedules—i.e., a fairly
constant response rate between reinforcements on 
each schedule—matching between response ratios and 
reinforcement ratios occurred. Similarly, when typical 
FI performance was found on both schedules—i.e., a 
postreinforcement pause followed by a rapidly accel- 
erating response rate on each schedule—matching was 
also observed. However, when differing response pat- 
terns were generated by the two schedules, VI re- 
sponding on the shorter FI schedule and FI respond- 
ing on the longer schedule, a power function with an 
exponent close to .50 (individual exponents ranging 
from .38 to .57) related the response and reinforcement
ratios on the two schedules. This exponent is
the same as that found by Nevin (1971) for concurrent 
VI FI schedules. 


Choice Between Interresponse Times 


The pattern of responding on concurrent schedules 
can also be altered by reinforcing only particular 
bands of interresponse times (IRTs). In experiments 
by Staddon (1968) and Shimp (1968, 1969b) two classes 
of IRTs on a single key were differentially reinforced. 
The pigeons therefore chose between emitting two 
different IRTs on the same key, each associated with a 
different frequency of reinforcement. In Shimp’s 
experiments, but not in that of Staddon, the two 
IRT bands were signaled by discriminative stimuli on 
the key. 

It is difficult to characterize the reinforced operant 
in these paced schedules, since a large amount of collateral
behavior is generated by such schedules
(Kramer & Rilling, 1970). Nevertheless, orderly rela- 
tions were obtained between response rates and rein- 
forcement frequencies. Shimp (1968, 1969b) demon- 
strated that the relative frequency of each IRT 
approximately matched the relative reciprocal of its 
length, and the relative rate at which each of the two 
IRTs was emitted was a monotonically increasing 
function of its relative frequency of reinforcement. 
Staddon (1968) divided all of the responses made by 
his subjects into two component distributions under 
the control of the short and long IRT contingencies, 
respectively. The pigeons’ allocation of responses to
each of these distributions (response ratios derived
from the component distributions) was related to the
ratios of reinforcement of the two IRTs by a power 
function with an exponent of approximately .67 (a 
range of .59 to .77 for individual birds). 

Moffitt and Shimp (1971) used a two-key concurrent
procedure in which one class of IRTs was reinforced 
on one key and a second class of IRTs was reinforced 
on the other key. A single VI programmer arranged 
reinforcements for both keys, and these were assigned 
to each key according to different probabilities. In one 
experiment, the relative reinforcement frequencies 
were equal, but the lengths of the two reinforced 
IRTs varied. The relative rate of responding on a key
approximately equaled the relative reciprocal of the
length of the IRT reinforced on that key. In a second
experiment, the relative reinforcement frequency for
two particular IRTs, one on each key, was varied.
The reinforced IRTs for one group of pigeons were
the same as those used by Shimp (1968), 1.5 to 2.5 sec
and 3.5 to 4.5 sec. For a second group of pigeons the
reinforced IRTs were those used by Staddon (1968),
2 to 3 sec and 10 to 11 sec.

Monotonically increasing, negatively accelerated 
functions described the relation between relative re- 
sponse rate and relative reinforcement frequency for 
each key, similar to the functions obtained for the 
same IRTs by Shimp (1968) and Staddon (1968). If
the data are plotted in terms of ratios of responses 
and reinforcements, the group of three pigeons ex- 
posed to the IRTs used by Staddon (1968) produce 
power functions with exponents of .73, .58, and .69, 
respectively. The group of pigeons exposed to the 
IRTs taken from Shimp’s earlier experiment produce
power functions with larger exponents, .91, .86, and
.63, respectively, nearer the 1.0 exponent predicted
by the biased matching relation. All of the pigeons
showed a marked bias toward the shorter of the two
IRTs, in keeping with the preference for shorter
IRTs suggested by Shimp (1968). Choice between two
concurrent IRTs therefore produces similar functions
relating response ratios (or proportions) to IRT 
lengths and to reinforcement ratios (or proportions) 
whether the IRTs are programmed on two keys or 
together on one key. 

The results of these experiments demonstrate that 
the closer together the two reinforced IRTs are in 
length, the higher the power function exponent and 
the closer the approximation to matching. Indeed, 
Shimp (1971) showed that when the same two bands 
of IRTs are reinforced on each key in a two-key concurrent
schedule, close matching occurs between relative
response rate on one key and relative reinforcement
rate on that key. The matching relation
therefore applies even in situations in which the pattern
of responding on each key is constrained by
paced schedules, provided that the response constraints 
are the same on the two keys. The more the constraints 
deviate from equality, the more relative response rates 
or response ratios deviate from matching. 

Other experiments by Shimp (1970) and Hawkes 
and Shimp (1974) have established some of the bound- 
ary conditions for matching to the relative reciprocal 
of the reinforced IRTs. When two different signaled 
IRT bands are reinforced equally frequently on a 
single key, preference for the shorter of the IRTs in- 
creases from near indifference at very low overall rates 
of reinforcement until it reaches an asymptote ap- 
proximating the matching-to-relative-reciprocal value 
between 20 and 30 reinforcements per hr (Shimp, 
1970). Hawkes and Shimp (1974) concurrently rein- 
forced two signaled IRT bands on a single key with 
equal frequencies. They maintained the relative re- 
ciprocal of the shorter IRT band at .70 but varied the 
absolute values of the two reinforced IRTs. Relative 
rate of emission of the two IRTs only matched the 
relative reciprocal of their lengths when the shorter 
IRT band was between 1.5 and 2.5 sec. This was
roughly the lower bound of the shorter class of IRTs 
used in previous experiments that reported matching 
to the relative reciprocal (Moffitt & Shimp, 1971; 
Shimp, 1969b, 1971). When the lower bound of the 
shorter IRT band was less than 1.0 sec the pigeons 
were nearly indifferent between the two IRTs. And 
when both of the reinforced IRTs were longer than 
2.5 sec the pigeons showed greater preference for the 
shorter IRT than was predicted by relative reciprocal
matching. 
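
The relative reciprocal to which these preferences are compared is simple to compute. In the sketch below, which value stands for the "length" of an IRT band (its lower bound, its midpoint, or some other value) is an assumption; the function name relative_reciprocal is illustrative.

```python
def relative_reciprocal(short_len, long_len):
    """Relative reciprocal of the shorter IRT: (1/short) / (1/short + 1/long)."""
    inv_short = 1.0 / short_len
    inv_long = 1.0 / long_len
    return inv_short / (inv_short + inv_long)


# Bands of 1.5 to 2.5 sec and 3.5 to 4.5 sec, taken at their lower bounds.
print(relative_reciprocal(1.5, 3.5))

# The same bands taken at their midpoints.
print(relative_reciprocal(2.0, 4.0))

# Bands of 2 to 3 sec and 10 to 11 sec, taken at their lower bounds.
print(relative_reciprocal(2.0, 10.0))
```

Taken at their lower bounds, the 1.5 to 2.5 sec and 3.5 to 4.5 sec bands give a relative reciprocal of .70 for the shorter IRT, and the more widely separated bands give a value above .80.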

To summarize, the matching relation applies in a 
wide range of choice situations, to several different 
parameters of reinforcement besides frequency, and to 
choice between several different schedules of rein- 
forcement besides VIs. In concurrent VI FI and paced 
schedules, however, the slope of the function relating 
choice and reinforcement for each alternative differs 
from that predicted by matching, and even the gen- 
eralized matching equations including a bias or pref- 
erence parameter (Equations 4 and 5) do not handle 
the data. They account for deviations in the intercept 
of the function from that predicted by matching—i.e., 
for a consistent preference for one alternative—but 
not for deviations in slope from 1.0. In the next sec- 
tion of the chapter I shall consider how a quantitative 
formulation of the relation between response strength 
and reinforcement, derived from the matching rela- 
tion, could account for the deviations from 1.0 slope 
found in these studies.


ABSOLUTE RATES OF RESPONDING AND
A QUANTITATIVE LAW OF EFFECT 


We have seen that in a wide range of choice situa- 
tions the relative rate of responding is directly deter- 
mined by the obtained relative frequency of reinforce- 
ment. However, the matching relationship between 
response proportions or ratios and the corresponding 
reinforcement proportions or ratios may not be the 
most fundamental way to quantify the relation be- 
tween response strength and reinforcement. Response 
and reinforcement proportions or ratios remain in- 
variant over large changes in overall response rate or 
reinforcement frequency. What about the absolute 
response rates in choice situations? 

Herrnstein (1970) reasoned that the relative fre- 
quency of reinforcement should determine not only 
the relative response rates but also the absolute rates 
of responding. As Herrnstein pointed out, at every 
moment of possible action a set of alternative re- 
sponses confronts the animal, so that each action is 
the outcome of a choice. No matter how the experi- 
menter tries to control the extraneous sources of 
reinforcement for responses other than those stipu- 
lated in the experiment, the subject will always have 
distractions available, even if they are merely con- 
cerned with its own body or physiological processes. 
Thus even on single-response procedures the subject 
is in a concurrent situation, although the experi- 
menter may monitor only one of the alternative re- 
sponses and reinforcers. 

Absolute response rate on single and concurrent 
schedules can therefore be considered a function of 
the frequency of reinforcement for that response rela- 
tive to all the other sources of reinforcement for com- 
peting responses. Herrnstein (1970, 1971) suggested a 
direct proportionality between the overall relative 
frequency of reinforcement and the absolute rate of 
responding. For a situation containing n alternative 
sources of reinforcement he proposed a general equa- 
tion of the form 


$$R_1 = \frac{k r_1}{\sum_{i=0}^{n} r_i} \qquad (9)$$


where $R_1$ is the rate of emission of the stipulated response
and $r_1$ is the frequency or magnitude of reinforcement
for that response. The parameter k represents
the asymptotic response rate in the absence of
any reinforcement for competing responses—i.e., when
$r_1$ equals $\sum r_i$, the total amount of reinforcement from
all sources in the situation. The parameter k can be
thought of as the total amount of behavior sustained 
by all the reinforcement available to the animal in the 
experimental situation. It is measured in the same 
units as the stipulated response—e.g., in responses per 
min or running speed units (see Herrnstein, 1974).
The relation of Equation 9 to the matching equation 
then becomes clear:


$$\frac{R_1}{R_1 + R_2} = \frac{k r_1 / \sum r_i}{k r_1 / \sum r_i + k r_2 / \sum r_i} = \frac{r_1}{r_1 + r_2} \qquad (10)$$


Single Schedules—Frequency of Food Reinforcement


In a single-response procedure, Equation 9 becomes 


$$R_1 = \frac{k r_1}{r_1 + r_e} \qquad (11)$$


where $R_1$, $r_1$, and k are as specified for Equation 9.
The parameter $r_e$ represents the total reinforcement
besides $r_1$ in the experimental situation⁶—i.e., all the
other reinforcers that a subject brings with itself or
finds in the experimental setting. The equation assumes
that they can be given a value in terms of the
units of reinforcement contingent on $R_1$; thus $r_e$ is
measured in terms of the same units as $r_1$, a frequency,
amount, or concentration of reinforcement (Herrnstein,
1974).
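
The two forms of the equation can be sketched in a few lines of code. The function names and all parameter values below are illustrative assumptions; the sketch shows only that absolute rates computed from a common k and a common total reinforcement yield matching, and that the single-schedule form is the special case with one stipulated response plus extraneous reinforcement.

```python
def response_rate(k, r_target, r_all):
    """General form: k times r_target over the sum of all reinforcement.

    r_all must include r_target and any extraneous reinforcement.
    """
    return k * r_target / sum(r_all)


def single_schedule_rate(k, r1, r_e):
    """Single-response form: k * r1 / (r1 + r_e)."""
    return k * r1 / (r1 + r_e)


# Hypothetical values: two scheduled sources plus extraneous reinforcement.
k, r1, r2, r_e = 100.0, 60.0, 20.0, 10.0
sources = [r1, r2, r_e]

rate_1 = response_rate(k, r1, sources)
rate_2 = response_rate(k, r2, sources)

# Relative response rate equals relative reinforcement frequency (.75 here).
print(rate_1 / (rate_1 + rate_2), r1 / (r1 + r2))
```

Because k and the denominator are shared by both responses, they cancel in the relative measure, which is why matching is unaffected by the values assumed for k and for extraneous reinforcement.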

6 In his 1970 paper “On the Law of Effect,” Herrnstein used
the expression $r_o$ to mean all reinforcements besides $r_1$—i.e.,
$\sum r_i - r_1 = r_o$. But in his later paper (Herrnstein, 1974), he used
$r_o$ in a more restrictive sense of reinforcements that come
spontaneously and are not conditional upon any responses.
Therefore, $r_e$ is used to denote more generally all reinforcements,
spontaneous or contingent on responses other than $R_1$—i.e., all
extraneous sources of reinforcement (Herrnstein & Loveland,
1974). I shall follow the more recent usage of $r_e$ throughout the
chapter.

Fig. 8. Rate of responding as a function of frequency of food
reinforcement for six pigeons responding on single VI schedules.
The least-squares fit of Equation 11 to the data is plotted for
each pigeon. The k- and $r_e$-values and the percentage of the
variance in response rate accounted for by the functions are
also shown. (Data from Catania and Reynolds (1968) as plotted
by Herrnstein, R. J. On the law of effect. Journal of the Experimental
Analysis of Behavior, 1970. © 1970 by the Society for the
Experimental Analysis of Behavior, Inc.)

Herrnstein (1970) tested this equation with data
from two experiments investigating the effects of frequency
of food on key-peck rate in pigeons. The best
data come from an exhaustive study of single VI
schedules by Catania and Reynolds (1968). Six pigeons
were exposed to VI schedules with frequencies of reinforcement
ranging from 8 to 300 reinforcements per
hr. Figure 8 shows the least-squares fit of Equation 11
to the data from each of the pigeons. The values of 
k and $r_e$ and the percentage of data variance accounted
for by the equation in each case are given in
the bottom right-hand corner of each panel. With a k
of 66.3 responses per min and an $r_e$ of 7.3 reinforcements
per hr, the equation also accounts for 91.3% of
the data variance when response rate is averaged 
across the pigeons for each VI value. 

Chung (1966) reinforced pigeons’ responses on a 
tandem FR 1 FIx schedule, where x represents a given 
duration after the first postreinforcement response. 
The first response after the postreinforcement pause 
started an FI timer, and the first response after the 
timer timed out produced food reinforcement. The 
length of the FI was varied, determining both rate of 
responding and rate of reinforcement. By this method, 
reinforcement rates of up to 2,000 per hr were ob- 
tained, many times the maximum value investigated 
by Catania and Reynolds (1968). Herrnstein (1970) 
demonstrated that Equation 11 also accounts for 
Chung’s results, though with somewhat higher parameter
values than those for Catania and Reynolds’
subjects. With a k of 130 pecks per min and $r_e$ of 210
reinforcements per hr, Equation 11 accounts for 94.7% 
of the variance in mean response rate for Chung’s 
pigeons. 
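
The shape of the fitted functions can be seen by evaluating the single-schedule equation, response rate = k r/(r + r_e), with the parameter values reported above. The sketch below is illustrative only: the function name predicted_rate is an assumption, and the intermediate reinforcement frequencies are arbitrary points within the ranges studied, not the schedule values actually used.

```python
def predicted_rate(k, r1, r_e):
    """Response rate predicted by k * r1 / (r1 + r_e)."""
    return k * r1 / (r1 + r_e)


# Averaged Catania and Reynolds parameters: k = 66.3 responses per min,
# r_e = 7.3 reinforcements per hr.
for r1 in (8, 30, 100, 300):
    print(r1, round(predicted_rate(66.3, r1, 7.3), 1))

# Parameters for Chung's pigeons: k = 130 pecks per min,
# r_e = 210 reinforcements per hr.
for r1 in (210, 500, 2000):
    print(r1, round(predicted_rate(130.0, r1, 210.0), 1))
```

The function is negatively accelerated and approaches k as reinforcement frequency grows. When the scheduled reinforcement equals the extraneous reinforcement, the predicted rate is exactly half of k, which gives the second parameter a convenient interpretation.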

Can the equation be extended to other parameters 
of reinforcement besides frequency, and to other mea- 
sures of response strength besides rate of key pecking? 
De Villiers and Herrnstein (in press) have carried out 
a comprehensive review of the literature on the rela- 
tion between several measures of response strength 
(e.g., running speed in an alley or latency to respond 
in a discrete-trial situation, as well as the rate of emis- 
sion of repetitive responses like key pecking or lever 
pressing) and several parameters of both positive and 
negative reinforcement. For each study the data were 
either taken from tables or estimated from figures. A 
least-squares fit of Equation 11 to the data from 
groups or individual subjects was performed, and the 
percentage of the variance in the dependent variable 
accounted for by the equation was calculated.⁷
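
The fitting procedure can be sketched as a search over candidate parameter values, scoring each pair by the mean-squared deviation of the obtained rates from those predicted by k r/(r + r_e). The code below is a minimal sketch under stated assumptions, not the program used for the review: the function name fit_hyperbola, the grids of candidate values, and the noise-free data are all hypothetical.

```python
def fit_hyperbola(r_values, rates, k_grid, re_grid):
    """Grid search for the k and r_e that minimize mean-squared deviation.

    Returns (k, r_e, proportion of variance accounted for).
    """
    n = len(rates)
    mean_rate = sum(rates) / n
    total_var = sum((y - mean_rate) ** 2 for y in rates) / n
    best = None
    for k in k_grid:
        for r_e in re_grid:
            mse = sum((y - k * r / (r + r_e)) ** 2
                      for r, y in zip(r_values, rates)) / n
            if best is None or mse < best[0]:
                best = (mse, k, r_e)
    mse, k, r_e = best
    return k, r_e, (total_var - mse) / total_var


# Hypothetical data generated from k = 60 and r_e = 10 without noise.
r_values = [8, 20, 60, 150, 300]
rates = [60 * r / (r + 10) for r in r_values]

k_fit, re_fit, vaf = fit_hyperbola(r_values, rates, range(40, 81), range(1, 31))
print(k_fit, re_fit, vaf)
```

With real data the grid would be centered on values estimated by eye and made as fine as the desired precision requires; the proportion returned is multiplied by 100 to give the percentage of variance accounted for.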


Magnitude of Food Reinforcement 


The most extensive study of magnitude of food 
reinforcement is that by Crespi (1942). Five groups of 
rats, 12 to 20 to a group, ran down a straight alley 
for different weights of dog chow, ranging from .02 to 
5.12 grams. Equation 11 describes remarkably well the 
relation between the quantity of food and the mean 
running speed (1/time in sec) for each group of rats, 
accounting for 99.6% of the variance in running 
speed. Appendix A summarizes this and several other 
studies investigating the effects of magnitude of food 
reinforcement on response strength. In the study of 
Davenport, Goodrich, and Hagquist (1966), individual
data were also available for four macaque monkeys
lever-pressing for a varying number of pellets on a
VI 1-min schedule. Once again the equation provides
an excellent fit to the data, accounting for 99.9, 90.4, 
90.1, and 96.6% of the variance in response rate for 
each monkey. For six of the remaining studies in 
Appendix A (Beier, 1958; Di Lollo, 1964; Hutt, 1954; 
Keesey & Kling, 1961, Experiment II; Logan, 1960, 12- 
hour group; Zeaman, 1949), which show group data 
from rats and pigeons, Equation 11 accounts for over
90% of the variance in the measure of response
strength in five cases, and for over 80% in the sixth.
Since k in runway studies is measured in response
speed units, 100/time in seconds to run the runway, it
varies with runway length. Hence the k-values are not
usually comparable across studies.

7 A digital computer iteratively calculated the mean-squared
deviation of obtained response rates from those predicted by
Equation 11 for a wide range of k- and $r_e$-values varying on
either side of those derived from a best fit by eye. The smallest
mean-squared deviation thus obtained was subtracted from the
total variance in response rate and the result divided by the
total variance to determine the percentage of data variance
accounted for by the equation.

There are three exceptions to the remarkably good 
fit of Equation 11 to the data on magnitude of reinforcement.
Keesey and Kling (1961, Experiment I)
studied four pigeons key-pecking for different-sized 
chick-peas (varying from four quarter-peas to four 
whole peas) on a VI 4-min schedule. Catania (1963a) 
studied two pigeons key-pecking for three different 
durations of grain reinforcement (3, 4.5, and 6 sec) on 
a VI 2-min schedule. Logan (1960, Experiment 55B) studied six
rats per group receiving varying numbers of food pel- 
lets (1, 3, 6, and 12) for running down a straight alley. 
At 12 hr of food deprivation, running speeds of the 
groups conformed to Equation 11; but at 48 hr of 
deprivation running speed ceased to be a monotonic 
function of number of pellets. In the other two devi- 
ant studies (Catania, 1963a; Keesey & Kling, 1961, 
Experiment I) the experimenters observed minimal 
changes in response rate with increases in reinforce- 
ment magnitude. Two factors could possibly account 
for the insensitivity of key-peck rate to variations in 
reinforcement in these two studies. First, only a very 
narrow range of magnitudes was investigated, no- 
where near the range studied by Crespi (1942) or the 
range of reinforcement in studies of frequency of food 
(Catania & Reynolds, 1968). Second, the pigeons were 
studied at high-drive levels in both experiments (as 
were the rats in the disconfirming condition of Logan’s
study). Herrnstein (1970) has argued that the
reinforcement from sources other than $r_1$—namely, $r_e$—will
be extremely small relative to $r_1$ itself under conditions
of high drive. Response rates would therefore
be very close to k over a fairly wide range of $r_1$-values,
since $r_1$ would approximate $\sum r_i$. The data variance in
both pigeon studies was minimal. 
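
The argument can be made explicit with a first-order approximation. The following is a sketch that assumes only that extraneous reinforcement is small relative to the scheduled reinforcement.

```latex
\[
R_1 = \frac{k r_1}{r_1 + r_e} = \frac{k}{1 + r_e / r_1}
\approx k\left(1 - \frac{r_e}{r_1}\right)
\qquad (r_e \ll r_1)
\]
```

If, for example, $r_e$ is 5% of $r_1$, the predicted rate is about .95 of k, and doubling $r_1$ raises it only to about .98 of k. A narrow range of magnitudes studied at high drive would therefore be expected to produce little change in response rate.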

In brief, Herrnstein’s equation can be readily ex- 
tended to the relation between response strength and 
magnitude of food reinforcement for several responses 
and species, though with a few exceptions. 


Brain Stimulation 


The magnitude of reinforcement from intracranial 
brain stimulation varies with several parameters of 
stimulation—e.g., the intensity, duration, or pulse fre- 
quency of the stimulation. But variations in response 
rate for brain stimulation on continuous reinforce- 
ment schedules do not provide an accurate measure of 
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the magnitude of reinforcement, since the frequency 
of brain stimulation covaries with response rate on 
these schedules. At higher intensities or durations of 
stimulation, motor effects of the brain stimulation also 
tend to interfere with responding. Keesey (1962, 1964) 
therefore used a VI 16-sec schedule to investigate the 
effects of several parameters of stimulation of the pos- 
terior hypothalamus on rate of lever pressing in rats. 
Over a wide range of values, Equation 11 accurately
depicts the relation between response rate and either 
the duration, intensity, or pulse frequency of the 
brain stimulation, accounting for 94.1, 92.9, and 
96.9% of the data variance, respectively. The least- 
squares fit to the data from both of Keesey’s studies is 
given in Appendix B. 

Gallistel (1969) argued that lever-press rate on a VI 
schedule of brain stimulation might not be a valid 
measure of the magnitude of reinforcement, since re- 
sponse rate is also sensitive to the time since the last 
stimulation. To control for any variation in the after- 
effects of the last brain stimulation, Gallistel gave all 
his subjects a series of 10 priming stimulations before 
they ran down a straight alley. Any differences in run- 
ning speed should then be a function of the magni- 
tude of the rewarding stimulation in the goal box. 
Nine rats, implanted in three different rewarding 
areas of the brain, ran the straight alley for a varying 
number of .01-msec pulses of electrical stimulation.
Appendix B shows that the least-squares fit of Equa- 
tion 11 accounts for a substantial proportion of the 
variance in running speed for each rat, even in some 
cases where there was little variation in running speed. 
The equation accounts for the behavior very well (i.e., 
better than 90%) for five out of the nine subjects. 


Quality of Reinforcement 


Numerous experiments have investigated the effects 
of different sugar concentrations on response strength. 
Guttman (1954) studied seven concentrations of su- 
crose (between 2 and 32%) and the same seven con- 
centrations of glucose with rats lever-pressing on a 
VI 1-min schedule. The rats responded faster for
sucrose than for glucose at each concentration, but 
Equation 11 accurately describes the relation between 
response rate and concentration for both reinforcers, 
accounting for 93.7 and 98.7% of the data variance. 

Appendix C summarizes the data from this and 
several other studies on concentration of sucrose. In 
Guttman’s (1953) experiment two different conditions 
were studied. In one a different group of about 20 
rats was exposed to each concentration; in the other, 
20 rats were exposed to all four concentrations. In
both cases the equation accounts for over 90% of the
variance in mean response rate. In analyzing the data 
from rhesus monkeys obtained by Conrad and Sidman 
(1956), response rates for the 60% sucrose solution 
were excluded because the experimenters report considerable
satiation at this concentration. Two drive
levels were used: 48 and 72 hr of food deprivation. As 
in the case of Logan’s (1960) experiment on magni- 
tude of reinforcement, the equation accounts for 
rather more of the variance in response rate at the 
lower drive level (96.2 versus 71.8%). 

A pair of more extensive studies by Schrier (1963, 
1965), also with rhesus monkeys and a lever-press
response, used considerably shorter sessions to avoid 
satiation effects. Over a range of concentrations from 
10 to 50%, the equation fits both the average and 
individual data remarkably well. For 10 of the 14 
available individual functions, the variance accounted 
for was greater than 90%, and for none did it fall be- 
low 70%. For the three separate group averages, Equa- 
tion 11 accounts for over 95% of the variance in each 
case. 


Immediacy of Positive Reinforcement (1/Delay)


The most comprehensive data on delay of reinforcement
in a single-response situation come from Pierce,
Hanford, and Zimmerman (1972). Four rats responding
on a VI 1-min schedule for food experienced
delays of reinforcement varying from .5 to 100 sec. During
the delay a cue light was illuminated and responding
had no programmed consequences. Equation 11
again provides an accurate description of the results
of Pierce et al., with immediacy of reinforcement
(1/delay) as r. It accounts for 96.1% of the variance
in mean response rate and for 80.8, 98.3, 78.9, and
97.4% of the variance for each of the four rats. The
same rats were studied in a second condition in which
the lever was retracted during the delay of reinforcement.
The equation fits the data from this condition
as well, accounting for 95.0% of the variance in mean
response rate and for 97.8, 98.6, 94.9, and 92.9% of the
individual data variances.

Appendix D gives the least-squares fit of the equation
to several other studies on delay of reinforcement
as well as that of Pierce et al. The Perin (1943) experiment
is noteworthy in that it measured latency to
lever press in a discrete-trial procedure as a function
of delay of reinforcement. The equation here accounts
for the relation between speed of responding
(100/latency in sec) and immediacy of food presentation
(1/delay). The remaining studies (Logan, 1960; Silver
& Pierce, 1969) summarized in Appendix D also conform
substantially to Equation 11. Both experiments
used rats as subjects and food as reinforcement, but
Logan measured speed in a runway and Silver and
Pierce measured rate of lever pressing. In each case,
including both high- and low-drive conditions in
Logan’s experiment, the equation accounts for over
90% of the variance in group data.
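
The two transformations used in these analyses, immediacy as 1/delay and speed as 100/latency, can be written out as a brief sketch. The function names and the sample delays are hypothetical. The check for a zero delay reflects the fact that 1/delay has no finite value there.

```python
def immediacy(delay_sec: float) -> float:
    """Immediacy of reinforcement, 1/delay (delay in sec)."""
    if delay_sec <= 0:
        raise ValueError("immediacy (1/delay) is undefined for a zero delay")
    return 1.0 / delay_sec


def speed(latency_sec: float) -> float:
    """Speed of responding, 100/latency (latency in sec)."""
    return 100.0 / latency_sec


delays = [0.5, 1.0, 2.0, 5.0, 10.0]            # hypothetical delays, in sec
immediacies = [immediacy(d) for d in delays]   # the r variable of Equation 11
```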


Frequency of Negative Reinforcement 


De Villiers (1974) demonstrated that Herrnstein’s 
equation for absolute response strength in single 
schedules of positive reinforcement can be extended 
to VI avoidance schedules, with shock-frequency re- 
duction (shocks avoided per min) as the reinforcer for 
avoidance (Herrnstein, 1969; Herrnstein & Hineline, 
1966). In de Villiers’s experiment, lever-press re- 
sponses canceled the delivery of shocks scheduled at 
variable intervals. If no lever press was made, all of 
the scheduled shocks were presented. The first re- 
sponse made after a scheduled shock, whether or not 
that shock had been presented, prevented the delivery 
of only the next scheduled shock. Extra responses 
between two scheduled shocks did not avoid further 
shocks. All the scheduled shocks could therefore be 
avoided if the rat responded at least once within every 
intershock interval, but the durations of the intervals 
varied unpredictably. On this VI avoidance schedule 
both received shock rate and shock-frequency reduc- 
tion (scheduled shock rate minus received shock rate) 
can be measured, and response rate is not constrained 
by any fixed temporal relations between responses and 
shocks, as it is on the free operant avoidance sched- 
ules usually studied (Sidman, 1953, 1966). 
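
The contingency just described can be sketched in code. This is an illustrative reading of the procedure, not de Villiers's programming apparatus: a scheduled shock is cancelled if at least one response has occurred since the previous scheduled shock (or since the start of the session for the first shock). Treating a response at the exact moment of a scheduled shock as belonging to the following interval is an assumption, and the shock and response times are hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple


def received_shocks(scheduled: Sequence[float],
                    responses: Sequence[float]) -> List[float]:
    """Return the scheduled shock times that are actually delivered."""
    responses = sorted(responses)
    delivered = []
    prev = 0.0                                  # session start
    for t in sorted(scheduled):
        # Count responses in [prev, t); one is enough, extras change nothing.
        n = bisect_left(responses, t) - bisect_left(responses, prev)
        if n == 0:
            delivered.append(t)
        prev = t
    return delivered


def shock_rates(scheduled: Sequence[float], responses: Sequence[float],
                session_min: float) -> Tuple[float, float]:
    """(received shocks per min, shock-frequency reduction per min)."""
    received = len(received_shocks(scheduled, responses)) / session_min
    programmed = len(scheduled) / session_min
    return received, programmed - received


scheduled = [10.0, 20.0, 30.0, 40.0]            # hypothetical shock times, sec
responses = [5.0, 6.0, 25.0]                    # hypothetical lever presses, sec
delivered = received_shocks(scheduled, responses)
received_rate, reduction = shock_rates(scheduled, responses, 1.0)
```

In this example the two responses before the first shock avoid only that shock, so the shock scheduled at 20 sec is delivered. Shock-frequency reduction is the programmed rate minus the received rate, as in the text.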

Four rats were exposed to an ascending and de- 
scending series of VI avoidance schedules ranging 
from a VI 15-sec (four programmed shocks per min) 
to a VI 75-sec (.8 programmed shocks per min). A wide 
range of response rates was produced by each rat on 
the different schedules. Response rates are plotted 
against shock-frequency reduction (shocks avoided per 
min) for each rat in Figure 9. The least-squares fit of 
Equation 11 is shown for each animal. Where there 
were two determinations of response rate for a given 
VI value, the mean of the two was used in calculating 
the best fit of the equation. Equation 11 accounts for 
over 95% of the variance in response rate for each rat 
(see Figure 9). Reinforcement from extraneous sources, 
r_e, in VI avoidance schedules represents the reinforcements
for freezing, crouching, defecating, etc., scaled
in terms of their equivalent value in shocks avoided
per min, the reinforcer for the avoidance response.

Fig. 9. Rate of responding as a function of shock-frequency
reduction for four rats responding on single VI avoidance
schedules. The least-squares fit of Equation 11 to the data is
plotted for each rat. The k and r_e values and the percentage
of the variance in response rate accounted for by the equation
are also shown. Filled circles represent the mean response rate
of two determinations for a given VI value; open circles represent
single determinations. (de Villiers, 1974.)


Magnitude of Negative Reinforcement 


Several experiments have investigated the effects of 
different amounts of voltage reduction between the 
alley and goal box on a rat’s running speed in a 
straight runway. Three of these studies (Bower, Fowler, 
& Trapold, 1959; Campbell & Kraeling, 1953; Seward,
Shea, Uyeda, & Raskin, 1960) are summarized in Appendix E.
All of them used different groups of rats for
each voltage reduction value. In fitting Equation 11,
de Villiers and Herrnstein (in press) took running
speed as the response measure and the reduction in
voltage as the reinforcer. Extraneous reinforcement r_e
is expressed as a voltage reduction, its value scaled in
the units of the independent variable. Thus in the
study of Bower et al. (1959) the value of the reinforcement
for competing responses such as freezing or
defecating is equivalent to a voltage reduction of
338 V. For all three experiments, Equation 11 accounts
for over 85% of the data variance.

Woods and his co-workers (1965, 1966) used an in- 
teresting variant of the runway escape procedure. 
Their rats swam down a straight alley filled with very 
cold water in order to be placed in a goal box con- 
taining much warmer water. The independent vari- 
able was the increase in water temperature between 
the alley and goal box; the dependent variable was 
swimming speed (100/time in sec). Equation 11 accounts
for these data very well, though it is not clear
why parameter values in the two studies differ by as
much as they do.

Dinsmoor and Hughes (1956) and Harrison and 
Abelson (1959) varied the duration of time-out from 
aversive stimulation contingent on a lever-press escape 
response. Dinsmoor and Hughes measured latency to 
lever press in a discrete-trial escape procedure and 
observed a monotonically increasing function relating 
speed of response (100/latency in sec) and duration of 
time-out. The equation accounts for 99.5% of the 
variance in speed of response for the rats that were 
escaping from a .2-mA shock. For the rats escaping 
from the stronger shock—.4 mA—variance in response 
speed was small and only about 75% of it was ac- 
counted for by the equation. Harrison and Abelson 
measured response rate on VI escape from a loud 
noise. They found substantial order effects and exposed
only one rat to complete ascending and descending
orders, but if response rates are averaged across
the several determinations for each time-out duration, 
over 90% of the variance is accounted for by Equa- 
tion 11. 


Immediacy of Negative Reinforcement 


Fowler and Trapold (1962) conducted a thorough 
study of delay of escape in a straight runway. Delay 
of voltage reduction in the goal box varied over five 
values between 1 and 16 sec for groups of five rats 
each. The observed relation between running speed 
in the runway (100/time in sec) and immediacy of 
escape (1/delay in sec) in the goal box conforms to 
Equation 11, 92.6% of the variance in running speed 
being accounted for by the equation with a k of 94.0 
(running speed units) and an r_e of .06 (1/delay units).
These results, together with other data from experi- 
ments varying delay of negative reinforcement, appear 
in Appendix F. 

Tarpy (1969) had rats escape from electric shock
by pressing either of two levers in a discrete-trial procedure.
The experiment’s main purpose was to investigate
preference as a function of differential delays between
a press on either lever and shutting off the
shock. But de Villiers and Herrnstein (in press) tested
Equation 11 with data from conditions in which the
delay of escape was the same for both levers; 89.5% of
the variance in response speed (100/latency in sec) for
five delays between 1 and 16 sec (excluding the 0-delay
condition) is accounted for. Tarpy and Koster (1970)
varied delay of discrete-trial escape from electric shock
by rats responding on a single lever. Using the three
nonzero-delay values, 94.6% of the variance in response
speed is accounted for by Equation 11. Leeming
and Robinson (1973) also varied delay of escape
from electric shock by rats, but the escape response
was running from one compartment to the other in a
shuttlebox. The equation accounts for 86.0% of the
variance in speed to respond (100/latency in sec).

Finally, Appendix F also contains Moffatt and 
Koch’s (1973) data from human subjects listening to 
a Bill Cosby comedy record. Occasionally the record- 
ing would stop and the subject could restart it by 
pressing a panel. The primary independent variable 
was the delay between the panel response and the on- 
set of the recording. For the three nonzero delays 
studied, Equation 11 accounts for over 95% of the
variance in speed of panel depression. 


Summary 


The remarkable generality of Herrnstein’s equation
is apparent from this survey. The behavior of rats,
pigeons, monkeys, and (in the one case we found)
people is equally well accounted for, whether the
behavior is lever pressing, key pecking, running speed,
or response latency in a variety of experimental settings.
The reinforcers can be as different as food,
sugar water, escape from shock or loud noise or cold
water, electrical stimulation of a variety of brain loci,
or turning a comedy record back on. Out of 53 tests
of Equation 11 on group data, the least-squares fit of
the equation accounts for over 90% of the variance in
the dependent variable in 42 cases, and for over 80%
in another six cases. Out of 45 tests on individual
data, the equation accounts for over 90% of the
variance in 32 cases, and for over 80% in another
seven cases. The literature appears to contain no evidence
for a substantially different equation than
Equation 11. Where the equation fails to account for
most of the data variance, that variance is negligible
in all cases but one. In the one exception (Logan,
1960, Experiment 55B) supposedly asymptotic running
speed was a sharply nonmonotonic function of
number of food pellets. Other experiments on pellet
number, including one by Logan (1960), found monotonic
functions, all conforming to Equation 11. This
equation therefore provides a powerful but simple
framework for the quantification of the relation between
response strength and both positive and negative
reinforcement.
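
The counts reported above can be restated as percentages of the tests conducted. The short sketch below uses only the figures given in the text; the helper name is hypothetical.

```python
def pct(n: int, total: int) -> float:
    """n as a percentage of total."""
    return 100.0 * n / total


group_over_90 = pct(42, 53)        # group tests with over 90% of variance
group_over_80 = pct(42 + 6, 53)    # group tests with over 80% of variance
indiv_over_90 = pct(32, 45)        # individual tests, over 90%
indiv_over_80 = pct(32 + 7, 45)    # individual tests, over 80%
```

On these figures roughly four-fifths of the group tests and about seven-tenths of the individual tests exceed the 90% criterion.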

Perhaps the most surprising feature of the close fit 
of the equation to the data is that it did not require 
any ad hoc transformations of the reinforcement vari- 
ables. For example, studies of brain stimulation (Ap- 
pendix B) or sucrose reinforcement (Appendix C) are 
well accounted for by Equation 11 by using ordinary
physical scales for the independent variables, such as
electrical pulses per second or percentage concentra- 
tion by weight. In contrast, studies in psychophysics 
indicate that many dimensions of experience are not 
accounted for by the physical scales, unless they are 
transformed (Stevens, 1957). At least two explanations 
of the simplicity of the independent variables are 
possible. 

First, some of the simplicity may reflect the rela- 
tively small ranges of the independent variable used 
in the studies. The range from 1 to 9 food pellets or 
.28 to 2.7 msec of brain stimulation is not quite an 
order of magnitude, while the usual psychophysical 
experiment uses several orders of magnitude. If such 
broad ranges of reinforcement could be used in stud- 
ies of animal behavior, rescaling of the reinforcement 
variable might become necessary or a different equation
might need to be formulated.
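
The claim about range can be checked directly by expressing each range as a base-10 logarithm of the ratio of its endpoints. A minimal sketch, using the two ranges quoted above (the function name is hypothetical):

```python
import math


def orders_of_magnitude(low: float, high: float) -> float:
    """Width of the range low..high in powers of ten."""
    return math.log10(high / low)


pellet_range = orders_of_magnitude(1, 9)        # 1 to 9 food pellets
pulse_range = orders_of_magnitude(0.28, 2.7)    # .28 to 2.7 msec
```

Both values fall just short of 1.0, that is, just short of a full order of magnitude.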

Second, several of the independent variables sur- 
veyed here have been shown to produce matching of 
relative response rate to relative reinforcement (see 
Equation 1) in concurrent schedules. Variables that
obey Equation 1 should also obey Equation 11 (see pp.
257 and 266). Therefore the fit of Equation 11 to 
the data on frequency, magnitude, and immediacy of 
reinforcement only further confirms the matching re- 
lation for choice. 


The Constancy of the k-parameter 


In a recent paper Herrnstein (1974) suggests that 
the equation can be made even stronger. He argues 
that a formal implication of the matching relation is 
that for any given response the parameter k must remain
constant across different qualities or quantities
of reinforcement or changes in the animal’s drive 
level, as long as the response topography does not 
change. The k-parameter is just a measure of behav- 
ior, such as responses per min, and represents the 
amount of behavior the observed response would 
show if there were no reinforcement for competing 
responses. When there are other sources of reinforce- 
ment, k measures the total frequency of all responses 
in units commensurate with the measure of the re- 
sponse being studied. The k-parameter is therefore 
simply “the modulus for measuring behavior” (Herrnstein,
1974), and the sole influence on its size is the
chosen response form itself. 

Several of the studies discussed above test the constancy
of k across different qualities and quantities of
reinforcement or across different drive levels. Only
cases in which over 90% of the variance was accounted
for by Equation 11 will be considered here,
since the value of k is then well specified. When the
equation was fitted to Guttman’s (1954) data on the
same eight rats lever-pressing for either sucrose or
glucose, practically the same k-value was derived for
the two reinforcers (15.6 versus 16.1 responses per
min). The reinforcement from other sources (r_e) was
much smaller for the sucrose function (7.1% versus
11.0%), in keeping with rats’ preference for sucrose,
the sweeter solution. Kraeling (1961) ran different
groups of rats in a runway for sucrose solutions of
varying concentration. Three separate functions relating
running speed to sucrose concentration were
found: one for each of three different durations of
access to the sucrose in the goal box. The three
k-values given by the least-squares fit of Equation 11
were 89.4, 87.9, and 88.7 running speed units, impressive
confirmation of the constancy of k considering
that different groups of rats ran in the different conditions.
Logan (1960) ran groups of rats under both
high- and low-drive levels (hours of deprivation) for
different immediacies of food reinforcement. The
k-values fitted to his data are 64.9 and 64.5. In the
experiment of Seward et al. (1960), rats ran down an
electrified runway for a reduction in voltage between
the alley and goal box. For two different alley voltages,
k-values of 161 and 146 running speed units
were derived, about a 10% difference in k across drive
levels. In the comparable experiment on escape from
cold water (Woods & Holland, 1966), the rats swam
from cold water to warmer water, and temperature increase
was the reinforcement variable. For two different
alley temperatures the k-values were 108 and 114
swimming speed units, an even better agreement between
k-values than that found by Seward et al. (1960)
for electric shock.

While the above studies support Herrnstein’s hypothesis
that k remains constant as reinforcement and
drive values change, some of the data analyzed do not.
In Keesey’s (1964) experiment the same 10 rats lever-pressed
on a VI 16-sec schedule for different durations
of brain stimulation under two intensity conditions,
3.0 and 1.5 mA. In this case, the least-squares fit of
Equation 11 produced a much larger k for the 3.0-mA
condition (21.0 versus 14.9 responses per min), while
r_e-values were very similar. In Schrier’s (1965) experiment
on monkeys lever-pressing for five different concentrations
of sucrose, two different quantities, .33
and .83 cc, were investigated. The k-values for the
mean response rates of the monkeys were 88 and 66.5
responses per min in the two conditions. Individual
k-values also showed considerable variation across the
two quantities (see Appendix C). Finally, the largest
discrepancy was found in the experiment by Campbell
and Kraeling (1953), in which rats ran down an
electrified runway for reduction in voltage between
the alley and goal box. The k-value derived for the
more intense alley shock condition was over twice that
derived for the less intense shock condition: 228 versus
106 running speed units. The data on the constancy
of k therefore remain equivocal, and further experiments
must examine the range of conditions under
which k does or does not remain constant.
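
The k-values quoted in this section can be placed on a common footing by expressing the spread within each study as a percentage of its smallest k. The sketch below uses only the values reported above. The choice of index (largest minus smallest, relative to the smallest) is an assumption made for illustration and is not the chapter's own measure.

```python
k_values = {
    "Guttman (1954), sucrose vs glucose": [15.6, 16.1],
    "Kraeling (1961), three access durations": [89.4, 87.9, 88.7],
    "Logan (1960), two drive levels": [64.9, 64.5],
    "Seward et al. (1960), two alley voltages": [161, 146],
    "Woods & Holland (1966), two alley temperatures": [108, 114],
    "Keesey (1964), two stimulation intensities": [21.0, 14.9],
    "Schrier (1965), two sucrose quantities": [88, 66.5],
    "Campbell & Kraeling (1953), two shock intensities": [228, 106],
}


def spread_pct(values):
    """Largest minus smallest k, as a percentage of the smallest."""
    return 100.0 * (max(values) - min(values)) / min(values)


spreads = {study: spread_pct(v) for study, v in k_values.items()}
for study, s in sorted(spreads.items(), key=lambda item: item[1]):
    print(f"{study}: {s:.1f}%")
```

On this index the first five studies differ by roughly 10% or less, whereas the last three differ by about a third or more, which is the contrast drawn in the text.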


ALTERNATIVE THEORIES OF 
RESPONSE STRENGTH 


Shimp (1974) has argued against Herrnstein’s equa- 
tion as an adequate theory of response strength in VI 
schedules on the grounds that it neglects the role of 
the distribution of reinforced interresponse times 
(IRTs) in determining the overall response rate.
Shimp points out that the mean rate of responding is 
the reciprocal of the mean of a distribution of IRTs, 
which is in turn determined by the distribution of 
reinforcements for those IRTs. He suggests that the 
pigeon in fact distributes its time among different 
IRTs according to how frequently each is reinforced 
relative to the others; i.e., the basic response is one of 
pausing between key pecks rather than key pecking 
per se. 

Shimp (1974) studied three pigeons responding on
a “synthetic” VI schedule (Anger, 1973; Shimp, 1973).
Food reinforcement for key pecking was arranged by
a single VI schedule and a programming device that
assigned the reinforcements equally to each of ten
classes of IRTs ranging from 1.0 to 6.0 sec in .5-sec
classes. Overall reinforcement frequency was varied
between 1 and 70 per hr by changing the VI schedule.
Overall response rate on this schedule was a monotonically
increasing, negatively accelerated function of
overall reinforcement frequency, much like that observed
with normal VI schedules by Catania and
Reynolds (1968). Shimp argued that this overall response
rate function could be decomposed into two
time-allocation functions: (1) the time allocated by
the pigeon to all of the reinforced IRTs as a function
of overall reinforcement frequency, i.e., the percentage
of session time taken up by all of the IRTs emitted
in the range from 1.0 to 6.0 sec; and (2) the time
allocated to any particular class of IRTs as a function
of the reinforcement rate obtained by that class of
IRTs, i.e., the number of emissions of that IRT class
times its lower bound duration. At asymptotic overall
response rate (over 20 reinforcements per hr) the relative
amount of time allocated to a particular IRT
class roughly matched its relative frequency of reinforcement.
The pigeons spent about the same amount
of time responding in each IRT class since each was
assigned the same frequency of food; therefore, they
made fewer of the long IRTs and more of the shorter
ones. The asymptotic overall response rate in responses
per min could therefore be predicted by a
combination of the percentage of session time taken
up by IRTs between 1.0 and 6.0 sec and the frequency
of emission of each of the IRT classes in that range.
This asymptote varies with the distribution of reinforcements
across IRT classes; thus Shimp argues that
there may be no such thing as a constant asymptotic
response rate across reinforcement and drive conditions,
as required by Herrnstein’s theory. Given the
distribution of reinforced IRTs, the experimenter
could in fact predict the asymptotic response rate,
since pigeons match the time allocated to each IRT
class to its relative reinforcement frequency (provided
that the overall reinforcement rate is above 20
per hr).
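
The prediction described here can be illustrated with a short calculation. The sketch makes several simplifying assumptions that are not Shimp's: every IRT is counted at the lower bound of its class, time is divided exactly equally among the ten classes, and responses with IRTs outside the 1.0 to 6.0 sec range are ignored. The proportion of session time spent in the reinforced range (share_in_range) is a hypothetical parameter.

```python
from typing import Sequence


def asymptotic_rate(lower_bounds_sec: Sequence[float],
                    share_in_range: float = 1.0) -> float:
    """Responses per min when equal time is allocated to each IRT class."""
    n = len(lower_bounds_sec)
    time_per_class = 60.0 * share_in_range / n   # sec of each min per class
    # Emissions of a class = time allocated to it / duration of one IRT.
    return sum(time_per_class / d for d in lower_bounds_sec)


classes = [1.0 + 0.5 * i for i in range(10)]     # 1.0, 1.5, ..., 5.5 sec
emissions = [6.0 / d for d in classes]           # per min, if share is 1.0
rate = asymptotic_rate(classes)
```

Under these assumptions the shortest class is emitted 5.5 times as often as the longest, and a different set of reinforced IRT classes would yield a different asymptote, which is the point of Shimp's argument.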

In assessing Shimp’s argument against Herrnstein’s 
molar formulation, however, several considerations 
should be taken into account. First, while Shimp 
(1973, 1974) and Anger (1956, 1973) have shown that 
an animal’s behavior is sensitive to the differential 
reinforcement of particular IRTs, the extension of 
their analysis to responding on normal VI schedules is 
unclear. On VI schedules with a constant probability
of reinforcement over time the differential reinforce- 
ment of IRTs is minimal unless the animal has a 
propensity to emit IRTs of a particular duration. 
Since pigeons tend to emit IRTs of about .3 to .5 sec 
duration (Blough, 1963), those IRTs may be differen- 
tially reinforced, but such an effect has not been con- 
clusively demonstrated. 

Second, Shimp’s synthetic VI schedule had important
differences from the normal VI. In his procedure
the same 10 IRT classes were reinforced equally frequently
at all VI values; but if differential reinforcement
of IRTs takes place in a normal VI, different
VI schedules should differentially reinforce different
IRTs. Each mean response rate on the function relating
responding to reinforcement frequency would
therefore be the outcome of a different time-allocation
function. Herrnstein’s quantitative formulation accounts
for the entire function, not only its asymptote.

Finally, there seems to be no way to extend Shimp’s 
account of responding on VI schedules to the great 
wealth of data on other measures of response strength 
(e.g., running speed or latency) that Herrnstein’s 
equation accounts for so well. 

Catania (1973) has proposed an alternative equation
for response strength based on assumptions about
the excitatory and inhibitory effects of reinforcement
on behavior. Whereas Herrnstein assumes that asymptotic
response rate (k) for a given response is constant
across different drive and reinforcement conditions,
Catania assumes that responding increases linearly
with increasing reinforcement:


R_1 = K r_1    (12)


This effect combines with an inhibitory effect of total
reinforcement on the response, whether the reinforcers
originate from other sources or from the response itself
(Catania, 1963b, 1969; Rachlin & Baum, 1969,
1972):


R_1 = \frac{K r_1}{1 + \sum r / C}    (13)


where C is a constant that depends on the magnitude
of the inhibitory effect of the reinforcers, and Σr is
the total reinforcement obtained from all sources.
Setting KC to a new constant k gives the following
equation:


R_1 = \frac{k r_1}{\sum r + C}    (14)


Here Σr includes only the specified sources of reinforcement,
not extraneous unspecified sources as in
Herrnstein’s r_e. Thus when r_1 is the only scheduled
reinforcement, Σr = r_1 and Equation 14 is mathematically
equivalent to Herrnstein’s equation for
single-response situations. The constant C is derived
from the data, as is r_e. Catania’s equation therefore
accounts for as much of the data summarized in the
earlier sections of the chapter as does Herrnstein’s
equation.
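
The equivalence claimed for the single-response case can be checked numerically. The sketch assumes that Catania's combined equation has the form R_1 = K r_1/(1 + Σr/C), a form consistent with the statement that setting KC = k yields an equation equivalent to Herrnstein's, and that Herrnstein's equation is R = k r/(r + r_e). The constants K and C and the reinforcement rates are hypothetical.

```python
def catania(r1: float, total_r: float, K: float, C: float) -> float:
    """Assumed form: excitation K * r1 attenuated by 1 / (1 + total_r / C)."""
    return K * r1 / (1.0 + total_r / C)


def herrnstein(r1: float, k: float, r_e: float) -> float:
    """Herrnstein's hyperbola for a single response."""
    return k * r1 / (r1 + r_e)


K, C = 2.0, 30.0                                 # hypothetical constants
rates = [5.0, 20.0, 80.0, 300.0]                 # hypothetical r_1 values
# With r_1 the only scheduled reinforcer, total_r equals r_1.
catania_pred = [catania(r, r, K, C) for r in rates]
herrnstein_pred = [herrnstein(r, K * C, C) for r in rates]
```

The two lists agree to within rounding error, with KC playing the role of k and C the role of r_e.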

While there are not yet empirical data to distinguish
between these two formulations, Catania (1973) argues
that Herrnstein’s equation assumes that the variety of
reinforcers subsumed under r_e do not interact in any
complex way with r_1, the scheduled reinforcer. But
there is evidence for interaction between such reinforcers
as food and water (Bolles, 1967). However,
Catania’s equation also fails to account for the data
on choice between food and water reinforcers. In the
experiment of Wood et al. (1975), rate of water reinforcement
on an FR schedule did not affect rate of
responding on a concurrently scheduled FI schedule
for food, as it should if food and water simply sum in
Σr in Equation 14. Therefore, neither equation readily
handles interactions between reinforcers.


APPLICATION OF HERRNSTEIN’S 
EQUATIONS TO OTHER SCHEDULES 


Herrnstein’s equation for single schedules, Equa- 
tion 11, applies best to VI schedules, in which the 
probability of a reinforcement being scheduled after 
one has been presented is approximately constant 
over time. Its extension to other schedules, such as 
FIs, is more difficult. Overall response rate on differ- 
ent FI schedules typically does not fit the equation 
very well. However, Schneider (1969) has argued for 
a two-state analysis of well-learned FI performance. In 
the first state, beginning immediately after reinforce- 
ment, response rate is very low and approximately 
constant over different FI values. At some variable 
time, on the average about two-thirds of the way 
through the FI, there is an abrupt transition (or 
break point) to a high and approximately constant re- 
sponse rate. Rate of responding in the second state is 
an increasing, negatively accelerated function of rein- 
forcement frequency in that state, much like the func- 
tion obtained for VI schedules. Schneider suggests that 
in a sense the pigeon is on a VI schedule in the second 
state, the interreinforcement intervals being deter- 
mined by the bird’s break point distribution. Herrn- 
stein’s equation actually provides an accurate account 
of second-state response rate in Schneider’s experi- 
ment, though with somewhat higher parameter values 
than those usually found for pigeons on VI schedules. 
With a k of 147 responses per min and an r_e of 24
reinforcements per hr, the equation accounts for 
92.2% of the variance in mean second-state response 
rate for Schneider’s pigeons. 
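
The meaning of these parameter values can be illustrated by evaluating the equation at a few reinforcement frequencies. The sketch assumes the form R = k r/(r + r_e) with the k and r_e reported above. The reinforcement frequencies are hypothetical and are not Schneider's data.

```python
def second_state_rate(r_per_hr: float, k: float = 147.0,
                      r_e: float = 24.0) -> float:
    """Predicted responses per min at r_per_hr reinforcements per hr."""
    return k * r_per_hr / (r_per_hr + r_e)


freqs = [12.0, 24.0, 60.0, 120.0]                # hypothetical, per hr
predicted = [second_state_rate(r) for r in freqs]
```

When the scheduled reinforcement frequency equals r_e (24 per hr), the predicted rate is exactly half of k. This is a general property of the hyperbola and gives r_e a direct interpretation.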

Several theorists have argued that the operation of 
at least two factors determines the pattern and rate of 
responding on schedules of intermittent reinforce- 
ment. Morse (1966) proposed that most schedule- 
controlled behavior results from the joint effects of 
reinforcement on response strength and the differen- 
tial reinforcement of certain interresponse times 
(IRTs). Thus ratio schedules tend to maintain higher
response rates than interval schedules for the same 
frequency of reinforcement possibly because there is 
more selection for long IRTs on interval than on 
ratio schedules (Killeen, 1969). Staddon (1972) simi- 
larly argued that responding on intermittent rein- 
forcement schedules is determined by two factors: the 
relative frequency of reinforcement and the relative
proximity of responses to reinforcement. The latter
principle determines the temporal location of be- 
havior under schedule control, accounting for the 
temporal pattern of responding typically found on FI 
and FR schedules, in which the relative proximity to 
reinforcement increases with time since the last reinforcement.
On VI schedules with a random distribution
of intervals, differential reinforcement of particular
IRTs is minimized and the relative proximity
of responses to reinforcement does not vary with post- 
reinforcement time. Consequently, response rate is 
fairly constant over time since reinforcement and is 
directly related to relative reinforcement frequency 
(see p. 258) (Catania & Reynolds, 1968). Staddon 
showed that these two factors can qualitatively ac- 
count for many of the properties of behavior on FI, 
VI, FR, and VR schedules. 

The data considered in the earlier sections of this
chapter suggest that Herrnstein’s equation provides 
an accurate quantitative formulation of one aspect of 
schedule-controlled behavior: the relation between 
response strength and relative reinforcement fre- 
quency or magnitude. Further research must specify 
the way this factor interacts quantitatively with other 
factors such as the relative proximity to reinforcement 
or differential reinforcement of IRTs to determine 
behavior on any given schedule. 


Concurrent Schedules 


Equation 11 can also be applied to absolute response
rates in two-response concurrent schedules.
The term r_e then breaks down into r_2 + r_e, where r_2
is the reinforcement rate associated with the second
response. Thus:


R_1 = \frac{k r_1}{r_1 + r_2 + r_e}    (15)


where k and r_e are as already defined for Equation 11.
Herrnstein (1970) demonstrated that this equation accounts
for the absolute response rates obtained by
Catania (1963b) for pigeons in a CO-key concurrent
procedure. The fit of the equation to the data is
shown in Figure 10. The left-hand panel shows response
rate on each schedule as a function of reinforcement
frequency for each when the total rate of
reinforcement was held constant at 40 per hr. The
right-hand panel shows response rate on each schedule
when the reinforcement frequency on Schedule 2
was held constant at 20 per hr and that on Schedule 1
varied from 0 to 40 per hr. The straight line and two
curves represent the plot of Equation 15 with k- and
r_e-values as shown in the figure. The same parameter
values were used for the functions in both panels.
Only one data point deviates substantially from the
plotted function, and the equation accounts for 91.0%
of the variance in response rates in the left-hand panel
and 90.5% of the variance in the right-hand panel.

Fig. 10. The absolute rate of responding on each schedule as a
function of the rate of reinforcement for each alternative in a
CO-key concurrent VI VI schedule (Catania, 1963b). For the
left panel, the overall frequency of reinforcement was held
constant at 40 reinforcements per hr, while varying complementarily
for the two alternatives. Each point is here plotted above
the reinforcement rate at which it was obtained. For the right
panel, the frequency of reinforcement for Schedule 2 was kept
constant at 20 reinforcements per hr, while varying between
0 and 40 per hr for Schedule 1. The points here are plotted
above the reinforcement rate on Schedule 1 at the time they
were obtained. The same values of k and r_e were used for the smooth
curves in both panels. (Catania, 1963b.)
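
Why the left-hand panel yields a straight line and the right-hand panel two curves can be seen by evaluating the equation under the two procedures. The sketch assumes the concurrent form R_1 = k r_1/(r_1 + r_2 + r_e). The parameter values are hypothetical, since the fitted values appear only in the figure.

```python
from typing import Tuple

K_HYP, RE_HYP = 75.0, 10.0       # hypothetical k (resp/min) and r_e (per hr)


def concurrent_rates(r1: float, r2: float, k: float = K_HYP,
                     r_e: float = RE_HYP) -> Tuple[float, float]:
    """Absolute response rates (R_1, R_2) on the two schedules."""
    denom = r1 + r2 + r_e
    return k * r1 / denom, k * r2 / denom


r1_values = [0.0, 10.0, 20.0, 30.0, 40.0]

# Left panel: total reinforcement fixed at 40 per hr, so the denominator is
# constant and R_1 is proportional to r_1 (a straight line).
left = [concurrent_rates(r1, 40.0 - r1) for r1 in r1_values]

# Right panel: r_2 fixed at 20 per hr, so R_1 rises along a negatively
# accelerated curve while R_2 falls as r_1 increases.
right = [concurrent_rates(r1, 20.0) for r1 in r1_values]
```

With total reinforcement held constant both schedules fall on the same line through the origin. With r_2 fixed, the response rate on Schedule 2 declines although its own reinforcement rate is unchanged.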

Equation 15 was also fitted to the mean absolute
response rates on each key from Chung and Herrnstein’s
(1967) study of choice between different immediacies
of reinforcement. For the pigeons with an
8-sec delay on the standard key, the equation accounts
for 90.5% of the variance in absolute response rate on
the two keys. The k-value was 60 responses per min;
r_e was .002 (1/delay units). The equation was not fitted
to the data for the two pigeons with a 16-sec
standard delay since they showed a consistent bias
toward the variable-delay key. These data therefore
also provide substantial confirmation for Herrnstein’s
system of equations.

Equation 15 predicts matching in concurrent sched- 
ules, since the denominators and k-parameters cancel 
out when relative response rate and reinforcement fre- 
quency are calculated: 


\frac{R_1}{R_1 + R_2}
  = \frac{\dfrac{k r_1}{r_1 + r_2 + r_e}}
         {\dfrac{k r_1}{r_1 + r_2 + r_e} + \dfrac{k r_2}{r_1 + r_2 + r_e}}
  = \frac{r_1}{r_1 + r_2}    (16)

If the asymptotic response rates (k) or the r_e-values on 
the two schedules differ, however, the k-parameters or 
the denominators in Equation 16 will not cancel out, 
and matching will not be obtained. Herrnstein’s 
formulation of the relation between absolute response 
rates and reinforcement therefore predicts deviations 
from matching in many cases. For example, in the 
paced schedules of Staddon (1968) and Moffitt and 
Shimp (1971), the reinforced IRTs were of different 
durations and hence the responses would differ in 
asymptotic response rate for the same frequency of 
reinforcement. They may also differ in their r_e-values, 
since more collateral behavior will be generated in 
the one case than in the other. 
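
The cancellation in Equation 16, and its failure when the parameters differ, can be checked numerically. The sketch below is only illustrative: the function names and the parameter values (k = 60 responses per min, r_e = 5, reinforcement rates of 30 and 10 per hr) are assumptions chosen for the example, not values taken from any of the studies cited.

```python
def concurrent_rates(r1, r2, k1=60.0, k2=60.0, re1=5.0, re2=5.0):
    """Absolute response rates on two concurrent alternatives (Equation 15).

    k1, k2   : asymptotic response rates (assumed illustrative values)
    re1, re2 : extraneous reinforcement for each alternative (assumed)
    """
    R1 = k1 * r1 / (r1 + r2 + re1)
    R2 = k2 * r2 / (r1 + r2 + re2)
    return R1, R2


def relative(a, b):
    """Relative rate of the first alternative."""
    return a / (a + b)


r1, r2 = 30.0, 10.0  # hypothetical reinforcements per hr

# Equal k and r_e: the denominators and k cancel, so matching follows.
R1, R2 = concurrent_rates(r1, r2)
print(relative(R1, R2), relative(r1, r2))

# Unequal asymptotic rates: k no longer cancels, so matching fails.
R1u, R2u = concurrent_rates(r1, r2, k1=60.0, k2=90.0)
print(relative(R1u, R2u))
```

With equal parameters the relative response rate equals the relative reinforcement rate; raising k for the second response alone pulls the relative response rate away from that value.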

Similarly, Equation 15 might account for the deviation 
from matching with concurrent VI FI (Nevin, 
1971; Trevitt et al., 1972) and FI FR (La Bounty & 
Reynolds, 1973) schedules. On FI or FR schedules, the 
value of r_e probably changes with the differing probability 
of reinforcement over time. While r_e should be 
fairly constant over time on a VI schedule, its value 
on an FI or FR schedule is likely to be high immediately 
following reinforcement, in the postreinforcement 
pause, and low immediately prior to reinforcement. 
Equal r_e-values for the two schedules in these 
procedures therefore cannot be assumed, as would be 
required for matching. If the differing mean r_e-values 
for each schedule could be ascertained, then the deviation 
from matching should be accounted for by 
Equation 15. However, until the application of Equation 11 
(Herrnstein’s equation for single schedules) is 
extended to FI and FR schedules, this explanation of 
the deviations from matching must remain speculative. 


Multiple Schedules and Behavioral Contrast 


In Equation 15 two reinforcement schedules are 
assumed to exert their full effect on each response, 
since the schedules are simultaneously available to the 
subject. Numerous operant conditioning experiments, 
however, reveal interactions of a lesser magnitude be- 
tween successive reinforcement conditions and the re- 
sponse rates they maintain. 

Reynolds (1961a) first demonstrated that response 
rate in the first component of a multiple schedule de- 
pends not only on the frequency of reinforcement in 
that component, but also on the reinforcement in the 
second component. He trained one group of pigeons 
on a multiple VI VI schedule and another group on a 
multiple VR VR, component duration for both groups 
being 3 min. In both cases, the rate of responding in 
the first component of the schedule increased con- 
siderably when reinforcement was discontinued in the 
second component, although the reinforcement fre- 
quency in the first component was unchanged. Reyn- 
olds called the effect “behavioral contrast.” Subsequent 
research has confirmed the phenomenon of behavioral 
contrast and extended it to different species, and to cases 
in which the responses or the reinforcers in the two 
components are different in kind (Beninger, 1972; 
Premack, 1969). Reynolds (1961b, 1963) himself dem- 
onstrated that a change in the relative reinforcement 
rate in each component of the schedule is the major 
determinant of stable contrast. As long as reinforce- 
ment in the interacting component was sustained at 
the same level, contrast failed to occur, whether or not 
the pigeons responded in that component (Reynolds, 
1961b). Bloomfield (1967a) found that both positive 
(increased response rate) and negative (decreased re- 
sponse rate) contrast were determined by changes in 
the relative reinforcement frequencies. Response rate 
in the VI component of a multiple schedule varied in- 
versely with the frequency of reinforcement in the 
second component, whether that component was a 
DRL or an FR schedule. Only the relative reinforce- 
ment frequency and not differences in response rates 
or patterns of responding on the two component 
schedules had any effect on the direction or degree of 
behavioral contrast observed (Bloomfield, 1967a). Re- 
sponse rate in the presence of a given stimulus is 
therefore determined by the frequency of reinforce- 
ment during all of the stimuli that successively control 
the subject’s behavior (Reynolds, 1961b). 

These results agree qualitatively with Herrnstein’s 
(1970) general formulation of the law of effect; re- 
sponse rate is directly related to its relative reinforce- 
ment frequency. In addition, Herrnstein (1970) showed 
how a simple modification of Equation 15 could pro- 
vide an accurate quantitative account of many stable 
interactions between response rate and reinforcement 
in multiple schedules. Since only one schedule oper- 
ates in each component, interaction between multiple 
schedules is presumably less than that in concurrent 
schedules. Therefore, Herrnstein inserted an interactive 
parameter into Equation 15, the equation describing 
absolute response rate in concurrent schedules: 


\[
R_1 = \frac{k r_1}{r_1 + m r_2 + r_e}
\tag{17}
\]




The parameter m varies between 0 and 1.0 and represents 
the degree of interaction between the two reinforcement 
conditions. In concurrent schedules interaction 
is maximal, so m equals 1.0, and Equations 15 
and 17 are identical. Matching then follows as shown 
in Equation 16. The modified equation is therefore 
proposed to account for both matching in concurrent 
schedules and behavioral contrast in multiple sched- 
ules. 
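
How the parameter m scales behavioral contrast can be seen in a small numerical sketch of Equation 17. The function name and all parameter values (k, r_e, m, and the reinforcement rates) are assumptions for illustration only.

```python
def multiple_rate(r_own, r_other, m, k=60.0, re=5.0):
    """Response rate in one component of a multiple schedule (Equation 17).

    r_own   : reinforcement rate in the component of interest
    r_other : reinforcement rate in the alternating component
    m       : degree of interaction, between 0 and 1.0
    k, re   : asymptotic rate and extraneous reinforcement (assumed values)
    """
    return k * r_own / (r_own + m * r_other + re)


r1 = 20.0  # hypothetical reinforcements per hr in the unchanged component

for m in (0.0, 0.4, 1.0):
    baseline = multiple_rate(r1, 20.0, m)    # multiple VI VI, equal rates
    extinction = multiple_rate(r1, 0.0, m)   # other component in extinction
    richer = multiple_rate(r1, 60.0, m)      # other component made richer
    print(m, round(baseline, 1), round(extinction, 1), round(richer, 1))
```

With m = 0 the three rates are identical (no contrast). With m > 0 the rate rises when the other component goes to extinction (positive contrast) and falls when it is made richer (negative contrast). With m = 1.0 the baseline value is the one Equation 15 gives for a concurrent schedule.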

Equation 17 accurately describes the stable behavioral 
contrast reported in several experiments that investigated 
a sufficient range of reinforcement frequencies 
or durations. For example, Lander and Irwin 
(1968) varied reinforcement frequency in a multiple 
VI 3-min VI x-min schedule and Nevin (1968) varied 
the duration of nonresponding required for reinforce- 
ment in the DRO component of a multiple VI 3-min 
DRO x. Rachlin and Baum (1969) manipulated the 
duration of reinforcement associated with one key 
from 1 to 16 sec while the VI schedule associated with 
a second key remained constant. Illumination of the 
key light signaled the availability of reinforcement on 
the key for which reinforcer duration varied; rein- 
forcement on the constant VI schedule was unsig- 
naled. Since the pigeons in Rachlin and Baum’s study 
only responded on the signaled key when the key light 
was illuminated—i.e., when reinforcement was avail- 
able—the conditions of stimulation, reinforcement, 
and responding on the two schedules were successive. 
Their procedure could therefore be considered a mul- 
tiple VI 3-min CRF schedule, even though two re- 
sponse keys were used. In all three of these studies, 
response rate on the unchanged schedule varied in- 
versely with the frequency or duration of reinforce- 
ment for the other schedule. With appropriate param- 
eter values selected, the averaged group data from 
each of the studies never deviates by more than 6 re- 
sponses per min from a perfect fit to Equation 17. For 
15 of 18 independent data points the deviation of data 
from theory is less than 3 responses per min (approx- 
imately 6%) (Herrnstein, 1970). 

The subjects in these experiments were pigeons 
and the reinforcer was access to food, but de Villiers 
(1972) extended Herrnstein’s equation for behavioral 
contrast to rats responding on multiple random-inter- 
val (RI) avoidance schedules. Lever-press responses 
canceled the delivery of shocks scheduled at random 
intervals, and the scheduled rate of shock was varied 
in the two components of the multiple schedule. 
When shock-frequency reduction (scheduled shock rate 
minus received shock rate) was substituted for r_1 and 
r_2 in Equation 17 (Herrnstein, 1969; Herrnstein & 
Hineline, 1966), the equation provided an accurate 




[Figure 11: relative response rate (vertical axis) plotted against relative reinforcement rate (horizontal axis, 0 to 1.0).] 

Fig. 11. The relative rate of responding as a function of the 
relative frequency of reinforcement in one component of a 
multiple VI VI schedule, for each of three pigeons. The 
smooth curves plot Equation 17, with r_e = 0 and m set to the 
values indicated. (From Reynolds, 1963.) 


quantitative description of the long-term positive and 
negative contrast effects obtained in this study. The 
values of the k- and r_e-parameters calculated for each 
rat from the behavioral contrast data even predicted 
subsequent response rates on single RI avoidance 
schedules when inserted into Equation 11, Herrnstein’s 
equation for single schedules. The deviation of 
observed from predicted response rate on the single 
schedules averaged only 5.5% of the observed response 
rate, and was never greater than 3 responses per min. 

Equation 17 also makes predictions about relative 
response rates on multiple schedules. If m < 1.0, the 
predicted relation between relative response rate and 
relative reinforcement frequency is plotted by a family 
of curvilinear functions, the curvature of which 
depends on the magnitudes of r_1, r_2, r_e, and m (Herrnstein, 
1970). An empirical test of this relation was 
provided by an experiment by Reynolds (1963) which 
systematically varied reinforcement rates in both 
components of a multiple VI VI schedule. Figure 11 
shows the data from Reynolds’s experiment and the 
relative response rate curve predicted by Equation 17 
for the given parameter values. The value of r_e was 
set to zero for these curves, but the r_e-values typically 
obtained for pigeons on VI schedules would change 
the functions only minimally. Once again the theory 
provides a good fit to the data. 
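
The family of curves described above can be generated directly from Equation 17. The following sketch sets r_e = 0, as was done for the curves fitted to Reynolds’s data; the function name, the total reinforcement rate, and the particular m values are assumptions for illustration and are not the values fitted to those data.

```python
def relative_response_rate(p, m, total=60.0, re=0.0):
    """Relative response rate in component 1 predicted by Equation 17.

    p     : relative reinforcement frequency in component 1 (0 < p < 1)
    m     : interaction parameter
    total : combined reinforcement rate, r1 + r2 (assumed value)
    re    : extraneous reinforcement, zero here
    k cancels out of the ratio, so it is omitted.
    """
    r1, r2 = p * total, (1.0 - p) * total
    R1 = r1 / (r1 + m * r2 + re)
    R2 = r2 / (r2 + m * r1 + re)
    return R1 / (R1 + R2)


for m in (0.1, 0.3, 1.0):
    curve = [round(relative_response_rate(p / 10, m), 2) for p in range(1, 10)]
    print(m, curve)
```

With m = 1.0 the function reduces to the matching line; smaller values of m flatten the curve toward .50, that is, they produce undermatching.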

At this point, however, two limitations on the ap- 
plication of Equation 17 must be noted: 


1. The equation not only assumes that a simple multiplicative 
mechanism governs the degree of interaction 
between the two reinforcement conditions, 
but it also assumes the most symmetrical of multiple 
schedules. It applies best to a schedule in which 
the response forms are the same in both components 
(i.e., in which they have equal asymptotic 
rates, k), in which the interaction is the same going 
from one component to the other in either direction 
(m), and in which extraneous sources of reinforcement 
are kept constant in the two components 
(r_e). Yet behavioral contrast occurs in multiple 
schedules in which many or all of these symmetries 
are violated (Premack, 1969). Such asymmetries 
would considerably complicate a quantitative analysis 
of the contrast effects, but they need not change 
the framework of the analysis, i.e., a framework 
which stresses relative reinforcement frequency (or 
magnitude) as a determinant of response strength. 


2. Second, the equation is not intended to account for 
short-term contrast effects in multiple schedules 
such as those discussed by Terrace (1966a, 1966b, 
1972) or even more transient changes in response 
rates reported by Nevin and Shettleworth (1966) 
and Bernheim and Williams (1967). Nevertheless, 
Herrnstein (1970) has indicated how Equation 17 
could be easily modified to handle the contrast 
effects described by Terrace (1966a) by assuming 
that an extinction component is aversive (Terrace, 
1966b) and adds a negative value to the denom- 
inator of the equation. 
Matching in Multiple Schedules 


Nevertheless, within the situations to which it di- 
rectly applies, Equation 17 makes a number of inter- 
esting predictions. It not only predicts matching in 
concurrent schedules, but also predicts matching in 
multiple schedules under particular conditions. 

First, if the reinforcement frequency in one component 
of the multiple schedule became extremely 
large relative to r_e and the scheduled reinforcement 
in other components, the denominators of the equations 
governing response rate in each of the components 
would approach equality. When the denominators 
are equal, matching is predicted (Equation 16). 
Data reported by Nevin in 1974 (cited by Herrnstein, 
1970) test this prediction. Nevin studied pigeons in a 
three-component multiple schedule. Two of the components 
were conventional VI 1-min and VI 3-min 
schedules, each lasting for 1 min, and the third component 
consisted of a 30-sec blackout of the chamber 
separating the two VI components. The independent 
variable was the frequency of noncontingent reinforcement 
provided during the blackout. As predicted 
by Equation 17, response rates in the two VI components 
decreased steadily as reinforcement frequency 
in the blackout increased from zero to 360 per hr 
(Herrnstein, 1970). Furthermore, relative response 
rates in the VI components approached matching: .75 
for the VI 1-min and .25 for the VI 3-min. Averaged 
over the four pigeons, relative response rate in the VI 
1-min component went from .54 when there was no 
reinforcement in the blackout to .72 when the blackout 
reinforcement frequency was 360 per hr, i.e., 6 
times the combined rate of reinforcement in the VI 
components. The denominators in Equation 17 for 
the two VI schedules therefore became more and more 
determined by the frequency of reinforcement in the 
blackout. 

Second, similarly, if the values of r_1 and r_2 became 
extremely small relative to r_e, reinforcement from extraneous 
sources in the two components, the denominators 
in Equation 17 for each component would be 
determined by r_e. As the denominators approach 
equality, relative response rates should also approach 
matching. 

Herrnstein and Loveland (1974) manipulated the 
relative importance of r_1 and r_e by varying their 
pigeons’ body weights. They argued that it is unnecessary 
to know the specific reinforcers that constitute r_e 
in order to conclude that r_e should become a larger 
fraction of the denominator as the pigeons get less 
and less hungry. Whatever reinforcers r_e contains, 
they should not covary with hunger the way r_1 and r_2 
must. Equation 17 therefore predicts that a subject’s 
performance in a multiple schedule should approach 
matching as the motivation for the scheduled reinforcer 
declines. 
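
This prediction can be illustrated with Equation 17 by shrinking the effective value of the scheduled reinforcers while r_e stays fixed. In the sketch below the multiplier h (a hypothetical “hunger” factor applied to r_1 and r_2), the function name, and the values of k, m, and r_e are all assumptions made for illustration; they are not taken from Herrnstein and Loveland’s procedure or results.

```python
def component_rates(r1, r2, h, m=0.3, k=80.0, re=10.0):
    """Equation 17 for both components, with scheduled reinforcement
    devalued by a hypothetical hunger factor h (1.0 = fully effective)."""
    v1, v2 = h * r1, h * r2
    R1 = k * v1 / (v1 + m * v2 + re)
    R2 = k * v2 / (v2 + m * v1 + re)
    return R1, R2


r1, r2 = 60.0, 15.0  # multiple VI 1-min VI 4-min, reinforcements per hr
results = []
for h in (1.0, 0.5, 0.1, 0.01):
    R1, R2 = component_rates(r1, r2, h)
    results.append((h, R1, R2, R1 / (R1 + R2)))
    print(h, round(R1, 1), round(R2, 1), round(R1 / (R1 + R2), 3))
```

As h falls, both absolute rates decline while the relative rate in the richer component climbs toward the matching value of .80, because r_e comes to dominate both denominators.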

The basic procedure for the entire experiment was 
a multiple VI 1-min VI 4-min schedule with 2-min 
components. The body weight of the pigeons was 
systematically varied from 80 to 110% of their ad 
libitum weight at the beginning of the experiment. 
Finally, the pigeons responded on the multiple schedule 
with a cupful of food in the experimental chamber. 
For all five pigeons, absolute response rates declined 
steadily with increasing body weight. At the 
same time, relative response rate in each component 
asymptotically approached the matching value (.80 
and .20, respectively). Two pigeons matched at 100% 
body weight and three pigeons matched at 110% body 
weight. With free food in the chamber, response rates 
dropped to low levels, between 5 and 15 responses per 
min; but the relative response rates for all five pigeons 
closely approximated matching. 

Finally, Equation 17 predicts matching in multiple 
schedules when m approaches 1.0—i.e., when the in- 
teraction between the two components is maximal, as 
in concurrent schedules. The more the two reinforce- 
ment conditions are separated in time by long com- 
ponent durations, the less interaction there is likely 
to be between them, the smaller the contrast effect, 
and the greater the deviation from matching. On the 
other hand, the more the multiple schedule approx- 
imates a CO-key concurrent schedule with rapid al- 
ternations between the two reinforcement conditions, 
the larger the contrast effects and the closer the ap- 
proximation to matching predicted by Equation 17. 

Experiments by Shimp and Wheatley (1971) and 
Todorov (1972) provide a direct test of this predic- 
tion. They varied component duration in a multiple 
VI VI schedule with asymmetrical reinforcement fre- 
quencies. As component duration shortened, relative 
response rate increased for the richer VI and decreased 
for the other schedule. At the longer component dura- 
tions (>10 sec), relative response rate undermatched 
relative reinforcement frequency on the richer VI, but 
with very brief components of 5 to 10 sec the relative 
response rate reached its maximal value and matching 
was observed. With 5-sec components, Shimp and 
Wheatley obtained matching to a wide range of rela- 
tive reinforcement frequencies in the two components. 

In a similar procedure, but with rats as subjects 
and multiple VI avoidance schedules, de Villiers 
(1974) demonstrated the same relation between relative 
response rate and relative frequency of negative 
reinforcement (shock-frequency reduction) as component 
duration varied. At the longer component durations, 
relative response rate undermatched the relative 
shock-frequency reduction; but as component duration 
shortened, the relative response rate increased 
until it was maximal at 40-sec components. At this 
component duration matching was obtained for three 
different relative shock-frequency reduction values. 
Figure 12 summarizes the matching data from Shimp 
and Wheatley (1971) for positive reinforcement and 
from de Villiers (1974) for negative reinforcement. 

Fig. 12. The relation between relative response rate and relative 
reinforcement frequency for three pigeons responding on multiple 
VI VI food schedules with 5-sec components (Shimp & 
Wheatley, 1971) (left-hand panel); and for three rats responding 
on multiple VI VI avoidance schedules with 40-sec components 
(right-hand panel). The diagonal lines represent matching between 
the relative frequencies. (de Villiers, 1974.) 


Contrary Evidence 


Certain limitations of Herrnstein’s equation for 
multiple schedules have already been discussed. In 
addition, recent experiments indicate that several of 
its predictions are not supported by the data. 

As Spealman and Gollub (1974) pointed out, Equa- 
tion 17 predicts that the higher the reinforcement rate 
in an equal-valued multiple VI VI schedule, the 
larger the behavioral contrast found in the unchanged 
component when the second component is changed to 
extinction. When reinforcement frequencies in the 
two components are equal, r_1 = r_2, and Equation 17 
can be rewritten as 


\[
R_1 = \frac{k r_1}{r_1 + m r_1 + r_e}
\tag{18}
\]


If reinforcement is no longer scheduled in the second 
component, response rate in the first component is 
governed by the following equation: 


\[
R_1' = \frac{k r_1}{r_1 + r_e}
\tag{19}
\]


where R_1’ is response rate after behavioral contrast. 
The magnitude of behavioral contrast (percentage 
increase in responding in the unchanged component) 
is then calculated as follows: 


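\[
\frac{R_1' - R_1}{R_1} \times 100
  = \frac{\dfrac{k r_1}{r_1 + r_e} - \dfrac{k r_1}{r_1 + m r_1 + r_e}}
         {\dfrac{k r_1}{r_1 + m r_1 + r_e}} \times 100
  = \frac{m r_1}{r_1 + r_e} \times 100
\tag{20}
\]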


According to this equation, the magnitude of behavioral 
contrast increases as r_1 increases, assuming that 
m and r_e are greater than zero and remain constant 
across the experimental conditions. 
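
Equation 20 can be checked against Equations 18 and 19 numerically. The sketch below does so and then compares two hypothetical baselines, VI 30-sec (120 reinforcements per hr) and VI 180-sec (20 per hr). The function names and the values of k, m, and r_e are illustrative assumptions.

```python
def rate_before(r1, m, k=60.0, re=10.0):
    """Equation 18: multiple VI VI with equal reinforcement rates."""
    return k * r1 / (r1 + m * r1 + re)


def rate_after(r1, k=60.0, re=10.0):
    """Equation 19: second component changed to extinction."""
    return k * r1 / (r1 + re)


def contrast_percent(r1, m, re=10.0):
    """Equation 20: percentage increase in the unchanged component."""
    return 100.0 * m * r1 / (r1 + re)


m = 0.3  # assumed interaction parameter
for label, r1 in (("VI 30-sec", 120.0), ("VI 180-sec", 20.0)):
    direct = 100.0 * (rate_after(r1) - rate_before(r1, m)) / rate_before(r1, m)
    print(label, round(direct, 1), round(contrast_percent(r1, m), 1))
```

Under these assumed values the predicted contrast is larger for the richer baseline (about 28% against 20%), so long as m and r_e are the same in both conditions.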

Spealman and Gollub (1974) tested this prediction 
with eight pigeons. Four of the pigeons responded on 
a multiple VI 30-sec VI 30-sec schedule, the other four 
on a multiple VI 180-sec VI 180-sec schedule. The 
components of the multiple schedules alternated every 
180 seconds. The second-component schedule for each 
group of pigeons was then changed to extinction. 
Contrary to the prediction of Herrnstein’s equation, 
the magnitude of behavioral contrast for the group 
with less frequent reinforcement (VI 180-sec) was 
greater than the contrast shown by all but one of the 
pigeons on the VI 30-sec. However, a critical assumption 
underlying the experiment is that the mean values 
of m and r_e do not differ across the two groups of 
pigeons, and this assumption may not be justified. It 
is plausible to suggest that extraneous sources of reinforcement 
(r_e) may be larger with less frequent contingent 
reinforcement and that the interaction between 
components (m) may be greater with more 
frequent reinforcement. But both of these effects 
would work in favor of the prediction by Equation 17 
(greater behavioral contrast with more frequent reinforcement) 
and against the observed results. 

Another prediction based on Equation 17 is that 
response rate on a multiple VI VI schedule will be 
lower than that on one of the VI schedules in isolation. 
Parameter R_1’ in Equation 19 represents response 
rate on a single VI schedule (cf. Equation 11), and 
R_1’ > R_1 in Equation 18 if m > 0. Herrnstein (1970) 
cited unpublished data from Terrace showing that 
pigeons’ response rates on a single VI 60-sec schedule 
were higher than their response rates on a multiple VI 
60-sec VI 60-sec schedule. De Villiers (1972) also found 
higher response rates on a single VI 15-sec avoidance 
schedule than in either component of a multiple VI 
15-sec VI 15-sec avoidance schedule with rats. However, 
in a second experiment Spealman and Gollub 
(1974) found that neither the pigeons on the VI 30-sec 
nor the pigeons on the VI 180-sec schedules responded 
faster on those schedules in isolation than on multiple 
schedules with the same VIs as components. 

Table 3. Mean absolute response rate in each component of the multiple schedule as a 
function of component duration, along with mean relative response rates and 
relative reinforcement frequencies for the component with more frequent 
reinforcement, for the studies of Shimp and Wheatley (1971) and Todorov (1972) 

SHIMP AND WHEATLEY (1971) 

Component     Responses/min          Relative Response    Relative Reinforcement 
Duration                             Rate in VI 1-min     Rate in VI 1-min 
(in sec)      VI 1-min   VI 4-min 
  180          53.1       25.4            .68                  .81 
   60          48.8       19.8            .71                  .80 
   30          52.7       19.8            .73                  .79 
   10          67.5       23.9            .74                  .80 
    5          83.2       19.7            .81                  .80 
    2          90.1       27.1            .77                  .81 

TODOROV (1972) 

Component     Responses/min          Relative Response    Relative Reinforcement 
Duration                             Rate in VI 30-sec    Rate in VI 30-sec 
(in sec)      VI 30-sec  VI 90-sec 
  300          47.3       31.9            .60                  .74 
  150          52.7       32.1            .62                  .75 
   40          52.9       26.8            .66                  .75 
   10          57.2       23.6            .71                  .75 
    5          58.4       27.0            .68                  .75 

One possible explanation of Spealman and Gol- 
lub’s results is that m had gone to zero by this stage of 
their study. The single VI schedule was the fourth 
experimental condition, preceded and followed by a 
multiple VI VI schedule condition. Since the pigeons 
were run for long periods of time in each condition, as 
long as 70 30-min sessions, and the components of the 
multiple schedule were signaled by red versus green 
key lights throughout, m may well have approached 
zero by the end of the experiment. Equal rates of re- 
sponding would then be expected for the single and 
multiple VI schedules. But if m = 0 in the last multi- 
ple-schedule condition of the experiment, response 
rate in that condition should be higher than that in 
the first multiple-schedule condition, where m was 
clearly greater than 0 (since substantial behavioral 
contrast was obtained). For only three of the seven 
pigeons run in the last multiple-schedule condition is 
that the case; for another three pigeons response rates 
are lower. Differences in component durations (180 
sec in Spealman and Gollub’s study versus 90 sec in 
Terrace’s) might also explain the discrepancy in results, 
since m should be smaller with longer components; 
yet Spealman and Gollub obtained sizable 
contrast effects. 

Finally, there is strong evidence that the process by 
which matching develops in multiple schedules with 
short components is different from that predicted by 
Equation 17. Provided that k and r, do not change, 
the equation predicts that as component duration 
shortens and m approaches 1.0, response rates in both 
components should decrease. Lowest response rates 
should occur at the component duration where inter- 
action is greatest and matching is obtained. Table 3 
summarizes the average mean response rates in each 
component of the multiple schedule for the pigeons 
studied by Shimp and Wheatley (1971) and Todorov 
(1972). As component duration decreased, so did response 
rate in the component with less frequent reinforcement, 
until it reached its lowest value at the 
component duration where approximate matching 
was found. In contrast, response rate in the compo- 
nent with more frequent reinforcement increased with 
decreasing component duration, and was highest at 
the shortest duration. De Villiers (1972) found no 
clear relation between absolute response rates and 
component duration in his experiment on multiple 
VI avoidance schedules, but his results were affected 
by prolonged adaptation of the animals to the electric 
shock. This analysis and Spealman and Gollub’s results 
therefore severely question the adequacy of 
Equation 17 as an account of the interactions between 
response rate and reinforcement frequency in multi- 
ple schedules. 
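
The mismatch between prediction and data can be made concrete. The sketch below first computes Equation 17 rates for increasing m (with assumed values of k and r_e, and nominal reinforcement rates of 60 and 15 per hr for VI 1-min and VI 4-min), then recomputes relative response rates from the absolute rates reported for Shimp and Wheatley (1971) in Table 3. The variable and function names are hypothetical.

```python
def eq17(r_own, r_other, m, k=100.0, re=10.0):
    """Equation 17 with assumed k and r_e."""
    return k * r_own / (r_own + m * r_other + re)


# Prediction: as m rises toward 1.0, rates in BOTH components fall.
predicted = [(m, eq17(60.0, 15.0, m), eq17(15.0, 60.0, m))
             for m in (0.2, 0.4, 0.6, 0.8, 1.0)]
for m, rich, lean in predicted:
    print(m, round(rich, 1), round(lean, 1))

# Observed mean rates (responses/min) from Table 3, Shimp and Wheatley (1971):
# component duration in sec -> (VI 1-min rate, VI 4-min rate)
observed = {180: (53.1, 25.4), 60: (48.8, 19.8), 30: (52.7, 19.8),
            10: (67.5, 23.9), 5: (83.2, 19.7), 2: (90.1, 27.1)}
relative = {d: round(rich / (rich + lean), 2)
            for d, (rich, lean) in observed.items()}
print(relative)
```

The predicted rates decline in both components as m grows, whereas the observed rate in the VI 1-min component rises from 53.1 to 83.2 responses per min as components shorten from 180 to 5 sec.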


AN ALTERNATIVE THEORY OF MATCHING 
AND BEHAVIORAL CONTRAST 


In a recent paper, Rachlin (1973) raised two ques- 
tions about the adequacy of Equation 17 in account- 
ing for both behavioral contrast and matching. He 
argued first that the equation implies that the same 
process underlies matching in both multiple and concurrent 
schedules. Yet Killeen (1972a) found clear 
differences between the patterns of behavior that lead 
to matching in the two procedures. 

In one experiment, Killeen divided his subjects 
into yoked pairs. Each pair contained a leader pigeon 
that was exposed to a CO-key concurrent schedule. 
Follower pigeons were yoked to the leaders so that 
the schedule in operation on the main key for these 
pigeons was determined by the changeovers of the 
leaders. Whenever a reinforcement was programmed 
for a response by the leader, it was also available for 
the follower. Killeen observed that leader and follower 
pigeons produced very similar patterns of responding, 
although the followers were on a true 
multiple schedule and had no control over alternation 
between the schedules. The absolute response rates 
were a little higher for the leader birds, but both 
groups of pigeons matched relative overall response 
rates to relative reinforcement frequency. The leader 
pigeons also matched the time they allocated to each 
schedule to the distribution of the reinforcements. 
The local rates of reinforcement in the concurrent 
and yoked conditions were therefore constant across 
the two component schedules. Killeen noted that the 
local response rates were also equal in both conditions 
so that relative local response rate matched 
relative local reinforcement frequency at .50. He concluded 
that in concurrent schedules the behavior observed 
is not so much response matching as it is 
equalizing. The pigeon adjusts the time spent in each 
component schedule so that the local reinforcement 
rates are equal. Overall relative response rate matching 
therefore results from the more basic process of 
time allocation. If this equalizing process is performed 
for, rather than by, the subject, as it was for the yoked 
pigeons, equal local response rates and overall response 
matching will be found even in multiple schedules. 
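
Killeen’s equalizing account can be sketched in a few lines. The example assumes, for illustration only, that the overall reinforcement rate obtained from each VI schedule does not depend on how time is allocated, and that the local response rate is the same constant on both schedules; the function name and numerical values are hypothetical.

```python
def equalizing_allocation(r1, r2):
    """Fraction of session time on each schedule that equalizes the local
    reinforcement rates r1/t1 and r2/t2 (reinforcers per unit time spent)."""
    t1 = r1 / (r1 + r2)
    return t1, 1.0 - t1


r1, r2 = 45.0, 15.0         # hypothetical overall reinforcements per hr
local_response_rate = 70.0  # assumed equal local rate, responses per min

t1, t2 = equalizing_allocation(r1, r2)
local_reinf = (r1 / t1, r2 / t2)  # equal by construction
overall = (local_response_rate * t1, local_response_rate * t2)
relative_overall = overall[0] / (overall[0] + overall[1])

print(t1, local_reinf, relative_overall, r1 / (r1 + r2))
```

A time allocation of .75 equalizes the local reinforcement rates, and with equal local response rates the overall relative response rate is also .75: response matching falls out of time allocation, while the relative local rates sit at .50.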


In a second experiment, however, Killeen (1972a) 
showed that the response matching found in multiple 
VI schedules with brief components (Shimp & Wheat- 
ley, 1971; Todorov, 1972) is different in kind from 
that found in concurrent VI schedules. If component 
durations are fixed at equal values while reinforce- 
ments are programmed by unequal VI schedules in 
the two components, the subject can no longer equal- 
ize local reinforcement frequencies. The distribution 
of time to the two schedules is fixed by the experi- 
menter so the local reinforcement rates approximate 
those set by the VI schedules in operation in each 
component. Local response rates therefore deviate 
from equality in the two components. With brief 
components, response rate-reinforcement interaction 
is maximal and the relative local response rate matches 
the relative local frequency of reinforcement provided 
by the two VI schedules. 

From Killeen’s results, Rachlin (1973) concluded 
that response matching in concurrent schedules re- 
sults from the subject’s distribution of time. Extra 
responses on the preferred schedule that lead to 
matching come from the extra time spent responding 
there, not from a higher local response rate. On the 
other hand, matching and behavioral contrast in mul- 
tiple schedules are indicative of an increased response 
output when the relative reinforcement frequency is 
increased. Equation 17 fails to distinguish between 
these two processes. 

Second, Rachlin argued that the equation does not 
account for both inhibitory and excitatory effects of 
reinforcement on responding. Rachlin and Baum 
(1972) found that free reinforcement scheduled at 
variable intervals inhibited key pecking maintained 
by a constant frequency of contingent reinforcement, 
in accordance with Herrnstein’s equations. In sharp 
contrast with these results, several recent experiments 
have demonstrated excitatory effects of free reinforce- 
ment on a pigeon’s key pecking. 

For example, Staddon and Simmelhag (1971) presented 
food reinforcement to a pigeon at fixed time 
intervals, irrespective of the bird’s behavior. They 
observed that after exposure to this procedure for a 
short period of time, pecking (at either the key or the 
walls of the chamber) began to predominate just prior 
to food presentation. Staddon (1972) argues that stimuli 
signaling the imminent presentation of food elicit 
pecking (as in autoshaping procedures; Brown & 
Jenkins, 1968; Gamzu & Williams, 1971), even if pecking 
has no effect on food presentation. Thus a base 
line response of key pecking is facilitated by a brief 
light stimulus signaling free food (LoLordo, 1971). 
Herrnstein and Loveland (1972) and Williams and 
Williams (1969) report that a pigeon will peck at a 
brief stimulus signaling food even if the peck avoids 
or postpones the food presentation. There appears to 
be a special relation between pecking and food for a 
pigeon such that pecking becomes the dominant re- 
sponse in the bird’s repertoire and tends to be emit- 
ted when food is imminent. Other responses that can 
be maintained by food reinforcement, such as treadle 
pushing, do not show the same special linkage to food 
for pigeons. LoLordo (cited by Staddon, 1972) reports 
that a base line of treadle pushing is suppressed by a 
light stimulus for free food because the pigeons peck 
at the signal key. Other research suggests that the 
special relation between pecking and food is species- 
specific for pigeons and that other responses may be 
linked to food reinforcement for other species (Bre- 
land & Breland, 1961). 

Staddon and Simmelhag (1971) therefore define a 
class of “terminal responses” that bear a special rela- 
tion to a particular reinforcer. These responses are 
excited by the mere presence of the reinforcer or by 
stimuli that signal its imminent presentation, and do 
not need to be selected for by the reinforcement con- 
tingencies. Different terminal responses may be linked 
to different reinforcers for a particular species. On the 
other hand, “interim responses” are not directly related 
to the reinforcer and tend to be emitted during 
periods of low reinforcement probability. 

An experiment by Gamzu and Schwartz (1973) first 
raised the question of the relation between response 
elicitation in the presence of reinforcements and be- 
havioral contrast. They presented free reinforcement 
to pigeons in both components of a multiple schedule. 
A color change of the key light signaled each compo- 
nent, but pecking the key had no programmed conse- 
quences. When the rate of free reinforcement was 
equal in the two components, no key pecking oc- 
curred. But when the rate of reinforcement in one 
component was reduced, the pigeons began to peck 
the key in the other component. Frequency of rein- 
forcement in that component was unchanged, but it 
was now associated with a higher relative reinforce- 
ment rate. Gamzu and Schwartz observed that the 
number of responses made in the unchanged compo- 
nent approximated the increase in responding found 
in normal contrast experiments using similar rates of 
contingent reinforcement. They therefore suggested 
that normal positive behavioral contrast consists of 
the instrumental responding appropriate to the sched- 
ule in the unchanged component, plus the extra pecks 
elicited by the stimulus signaling a higher relative 
rate of reinforcement in that component. If a stimulus 
associated with a high frequency or probability of reinforcement 
alternates with a signal for low reinforcement rates, it elicits 
pecking in pigeons in both multiple schedules and autoshaping 
procedures. 

Subsequent experiments have determined some of 
the conditions in which autoshaped pecking is pro- 
duced in the Gamzu and Schwartz procedure. Redford 
and Perkins (1974) found that pigeons produced sub- 
stantial rates of key pecking in the component as- 
sociated with the higher relative frequency of free 
reinforcement only if the color stimuli signaling the 
two components were localized on the key. Substantial 
positive contrast was also found with pigeons pecking 
on a conventional multiple VI VI schedule when the 
schedule changed to multiple VI EXT (extinction) 
only if the component stimuli were localized on the 
key. When the components of the multiple schedule 
in both procedures were signaled by a change in the 
color of the houselight, none of Redford and Perkins’s 
pigeons showed positive contrast or autoshaped pecking. 
This result is in keeping with an autoshaping 
account of positive contrast, since autoshaping of key 
pecking in standard procedures can only be obtained 
when the prefood stimulus is located on the response 
key (Wasserman, 1973). However, it should be noted 
that Redford and Perkins failed to get differential re- 
sponding in the VI and EXT components of the 
multiple schedule when these were signaled by a 
change in the color of the houselight. They ran only 
five multiple VI EXT sessions, and those alternated 
with VI VI sessions. If their pigeons failed to dis- 
criminate the change in reinforcement conditions in 
the EXT component, behavioral contrast would not 
be expected to occur in the other component. 

Schwartz (1973) demonstrated that key pecking is 
not elicited in the Gamzu and Schwartz procedure if 
the schedule components are signaled by the presence 
and absence of an auditory stimulus. Once key peck- 
ing was established with a change in key color sig- 
naling the components, however, it could be trans- 
ferred from the control of the visual stimulus to a 
tone, although it was maintained at a somewhat re- 
duced level. Subsequently, the pecking could be reini- 
tiated by the tone signal after an extinction procedure. 
Schwartz suggests that for pigeons preexperimental 
relations exist among food, visual stimuli, and peck- 
ing, but not for auditory stimuli. Thus a pigeon’s key 
peck cannot be autoshaped with an auditory prefood 
stimulus. 

The results of two experiments by Keller (1974— 
cited by Rachlin, 1973) also support an autoshaping 
analysis of positive contrast. Pecking on one key (in- 
strumental key) in the experimental chamber pro- 
duced reinforcement according to a multiple VI 30-sec 
EXT schedule. The stimuli associated with each component 
of the schedule, however, were presented on a 
second key (signal key). Pecking at either key in the 
VI component produced a feedback click; pecking in 
the EXT component produced no feedback. After 
training on the multiple VI 30-sec EXT schedule, the 
pigeons were run on a multiple VI 30-sec VI 30-sec 
schedule until responding was stable in both compo- 
nents. When reinforcement in the second component 
was again discontinued, only one of the three pigeons 
showed temporary behavioral contrast on the instru- 
mental key in the VI 30-sec component; the other two 
pigeons showed no contrast on the instrumental key 
but began to peck the signal key in the unchanged 
component. 

In Keller’s second experiment there were three 
components in the multiple schedule, each associated 
with a different color key light on the signal key. The 
pigeons were trained on a multiple VI 1-min VI 1-min 
EXT schedule, and then the component schedules 
were varied. They were either VI 1-min or EXT. 
None of the pigeons showed behavioral contrast on 
the instrumental key during a VI component when 
one or two of the other components was EXT; all 
three pigeons pecked at a substantial rate on the signal 
key during the VI components but not during the 
EXT components. Signal key response rate was high- 
est when only one component was a VI. Rachlin 
(1973) concludes that if the signal and instrumental 
pecks were superimposed on a single key, as in normal 
multiple schedules, positive behavioral contrast would 
have been observed. 

In view of these results, Rachlin (1973) suggests 
that whereas response matching in concurrent schedules 
is a by-product of a basic process of time allocation, 
matching and contrast in multiple schedules 
result from more direct excitatory and inhibitory 
effects of relative reinforcement frequency on respond- 
ing. A stimulus associated with a higher relative rein- 
forcement frequency elicits extra pecks, while a stim- 
ulus associated with a lower relative frequency inhibits 
pecking. In more general terms, Rachlin argues that 
“terminal” (Staddon & Simmelhag, 1971) or auto- 
shaped responses are excited during periods of high 
relative reinforcement frequency, while “interim” re- 
sponses are inhibited. During periods of low relative 
reinforcement frequency the opposite process occurs: 
terminal responses are inhibited and interim responses 
excited. Hence both positive (excitatory) and negative 
(inhibitory) contrast effects occur. Matching is ob- 
served in both multiple and concurrent schedules be- 
cause the two different processes that produce it are 
reactions to the same independent variable: the relative 
value (in most cases the relative reinforcement 
frequency) of the component schedules. 

To the extent that Rachlin emphasizes relative 
reinforcement frequency as the major determinant of 
matching and contrast, his analysis is similar to Herrn- 
stein’s; but Rachlin stresses the differences between 
the underlying response rate and reinforcement inter- 
actions in multiple and concurrent schedules. His 
account of matching in multiple schedules is sup- 
ported by the analysis of the response rate data from 
Shimp and Wheatley (1971) and Todorov (1972) given 
on p. 271, while Herrnstein’s account is not. As the 
interaction between the components increases, Rach- 
lin suggests that the pigeon’s terminal response for 
food—pecking—should be facilitated in the period of 
high relative reinforcement frequency and more and 
more inhibited in the period of low relative reinforcement 
frequency. In fact, peck rate increased in the 
component with higher reinforcement frequency and 
decreased in the other component, with decreasing 
component duration in both studies. 

However, Rachlin’s theory of the processes producing 
the two kinds of response matching is itself inadequate 
in several respects. Response matching occurs not only 
in concurrent VI VI schedules, where local response 
rates are equal in each schedule. It is also found in 
concurrent VI VR schedules in which the local rates 
differ considerably on the two schedules and the 
subject cannot equalize both local response rates and 
reinforcement frequency (Herrnstein, 1970, 1971; 
Herrnstein & Loveland, unpublished data). It is also 
obtained in discrete-trial concurrent schedules where 
time allocation applies only arbitrarily (Nevin, 1969; 
Herrnstein, unpublished data). Hence more direct response 
matching occurs in concurrent schedules than 
Rachlin’s theory implies. 

Furthermore, the generality of Rachlin’s explana- 
tion of behavioral contrast in multiple schedules is 
questionable. The results of an unpublished experi- 
ment by Bouzas question the generality of Keller’s 
findings. Bouzas trained six pigeons on a multiple VI 
1-min VI 1-min schedule with 30-sec components during 
which the instrumental key in the chamber was 
always white. In one experimental condition the com- 
ponents were signaled by a color change on a second 
(signal) key (i.e., Keller’s procedure); in a second con- 
dition the components were signaled by the presence 
or absence of a tone. The six pigeons were studied in 
both conditions, three receiving Keller’s procedure 
first and three receiving the tone condition first. 
When Bouzas discontinued the reinforcement in the 
second component, five of the six pigeons showed positive 
contrast in the first component in the tone condition. 
In the Keller procedure, four of the six pigeons 
showed positive contrast on the instrumental key in 
the absence of any pecking at the signal key. A fifth 
pigeon did peck at the signal key, but did not show 
contrast even when those extra pecks were added on 
the instrumental key. Bouzas then studied three of the 
pigeons in the Keller procedure with shorter component 
durations. The birds were trained on a multiple 
VI 1-min VI 1-min schedule with 10-sec components, 
and then switched to a VI 1-min VI 3-min. None of 
the pigeons showed contrast on the instrumental key 
in this procedure; but two of the three birds pecked 
the signal key a substantial amount during the VI 
1-min component. Adding these extra pecks to response 
rate on the instrumental key produced positive 
contrast for these two birds. But the difference in 
results between the studies of Keller and Bouzas can- 
not be explained by component durations; Keller 
used 1-min and 2-min components, longer than either 
of the values used by Bouzas. Frequency of reinforce- 
ment could be a factor, as only one of Keller’s three 
pigeons in his first experiment pecked at the signal 
key when he changed from a multiple VI ½-min VI 
½-min to a multiple VI ½-min EXT schedule. Bouzas 
used VI 1-min schedules. On the other hand, Keller 
obtained signal key pecking from all three pigeons in 
the three-component multiple VI VI EXT schedule 
in which VI 1-min components were used. 

The contrast effect found by Bouzas for the pigeons 
with the tone signal is also surprising in terms of an 
autoshaping analysis of contrast, since a pigeon’s key 
peck can only be autoshaped with visual and not audi- 
tory stimuli. Similarly, LoLordo et al. (1974) found 
that a brief stimulus signaling free food only facili- 
tated the base line key pecking if a visual signal was 
presented on the response key. An auditory prefood 
stimulus did not facilitate responding. But West- 
brook (1973) also obtained positive contrast for six 
pigeons when the components of the multiple sched- 
ule were signaled by white noise versus a tone. The 
contrast effects found with auditory stimuli signaling 
components seem to be smaller than those found with 
visual stimuli on the key, but they are not readily 
explained by the autoshaping theory of contrast. (For 
a detailed discussion of the strengths and weaknesses 
of an autoshaping account, including a more detailed 
review of Keller’s results, see chapter 3 by Schwartz 
and Gamzu in this volume.) 

In brief, a large body of data on stable behavioral 
contrast and matching in multiple schedules can be 
accounted for by a particular view of response strength 
—namely, that the strength of a response is directly 
related to its relative frequency (or magnitude) of 
reinforcement. But it has become clear from the 
studies discussed here that no one analysis of behav- 
ioral contrast can handle all of the phenomena usu- 
ally subsumed under that term. We must distinguish 
between short-term, transient effects that may be emo- 
tional in origin (Terrace, 1966b, 1972) and long-term, 
stable interactions between successive reinforcement 
conditions (Bloomfield, 1967b; Herrnstein, 1970). 
Other contrast effects seem to be related to excitatory 
effects of some reinforcers on particular responses 
(Gamzu & Schwartz, 1973; Staddon, 1972), and may 
even be species-specific. Indeed, there is no reason to 
believe that all of these contrast effects can be ex- 
plained by the same process; many different phenom- 
ena may be involved. 


DISCUSSION 


Preceding sections of this chapter have considered 
a wide range of choice procedures in which a simple 
matching relation holds between relative performance 
measures and relative measures of reinforcement. 
They have shown how a quantitative theory of re- 
sponse strength derived largely from the matching 
relation can account for a remarkable range of data 
on response rate—reinforcement interactions in single, 
multiple, and concurrent schedules. But they have 
also indicated that some choice procedures do not pro- 
duce the matching relation (Moffitt & Shimp, 1971; 
Nevin, 1971; Staddon, 1968; Trevitt et al., 1972; White 
& Davison, 1973). To what extent do these results 
weaken the matching relation as a quantitative law of 
choice in concurrent schedules? 

Rachlin (1971) has argued that despite these em- 
pirical data, the matching relation “is not an em- 
pirical law but a statement of assumptions made prior 
to empirical test” (p. 249). It derives directly from 
our intuitions about unconstrained choice and as such 
it is not subject to disproof by empirical data. The 
theoretical matching law as formulated by Rachlin 
states that 

$$\frac{T_1}{T_2} \text{ or } \frac{R_1}{R_2} = \frac{V_1}{V_2} \tag{21}$$

$$\frac{V_1}{V_2} = \frac{r_1}{r_2} \cdot \frac{a_1}{a_2} \cdot \frac{i_1}{i_2} \cdot \frac{x_1}{x_2} \tag{22}$$

where T represents the time allocated, R the number 
of responses made, and V the value of reinforcement 
consequent on each alternative; r is the reinforcement 
frequency, a the amount of reinforcement, i the immediacy 
of reinforcement, and x any other parameters 
of reinforcement for the two alternatives. 

Rachlin believes that the matching law embodied 
in Equations 21 and 22 is a tautology or analytic 
statement about how reinforcing value should be 
measured. Wherever there are deviations from the 
simple matching relation (Equation 1), transforma- 
tions on the independent variable dictated by ob- 
tained time (or response) ratios can make the law fit 
the data. A difference between programmed and 
obtained reinforcement (Premack, 1965) or the actions 
of unknown reinforcers (i.e., x_1 and x_2 in Equation 22) 
can always be invoked to minimize the 
deviation from matching. In these ways the theoreti- 
cal matching law cannot be empirically disproved or 
supported. For Rachlin the value of the matching law 
lies in specifying the assumptions made in any choice 
experiment and circumscribing our search for hidden 
reinforcers so that matching will hold in choice 
situations. 
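
Rachlin's point that the law can always be made to fit can be illustrated numerically. The sketch below assumes that Equation 22 combines the reinforcement parameters multiplicatively (value ratio = frequency ratio x amount ratio x immediacy ratio x unknown-parameter ratio). The function name and the sample numbers are hypothetical and are not taken from any study cited here.

```python
def implied_unknown_ratio(time_ratio, freq_ratio,
                          amount_ratio=1.0, immediacy_ratio=1.0):
    """Return the ratio x1/x2 of 'unknown' reinforcement parameters that
    would have to be postulated for the theoretical law to hold exactly.

    Assumes (hypothetically) the multiplicative form
        V1/V2 = (r1/r2) * (a1/a2) * (i1/i2) * (x1/x2)
    together with T1/T2 = V1/V2.
    """
    known = freq_ratio * amount_ratio * immediacy_ratio
    return time_ratio / known


# Hypothetical undermatching: reinforcement frequencies differ 4:1,
# but the subject allocates time only 2:1.
x_ratio = implied_unknown_ratio(time_ratio=2.0, freq_ratio=4.0)

# With x1/x2 chosen this way, the value ratio equals the time ratio
# by construction, so the "law" holds whatever the data were.
value_ratio = 4.0 * x_ratio
```

Because `x_ratio` is computed from the obtained time ratio itself, no observed time ratio could ever contradict the law under this reading, which is the sense in which it cannot be empirically disproved or supported.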

A recent theoretical paper by Killeen (1972b), however, 
raises several arguments against Rachlin’s approach 
to the matching relation. Killeen argues first 
that the empirical matching relation (Equation 1) 
must be distinguished from Rachlin’s theoretical 
matching law (Equation 21). The empirical matching 
equations clearly are not examples of a law “not subject 
to empirical test,” since concurrent schedule 
experiments could well have found some other rela- 
tion between relative performance measures and rein- 
forcement. On the other hand, Rachlin’s matching 
law is indeed a tautology, since it assumes that time 
ratios define value ratios. But Killeen suggests that 
this only generates a redundant intervening variable, 
i.e., value. Since time ratios must equal value ratios, 
the former could be substituted for the latter when- 
ever they occur, and the notion of value adds nothing 
to Rachlin’s analysis. “The generality of Rachlin’s 
law is uninteresting since it can be obtained in any 
situation where one cares to postulate an intervening 
variable equal to the data in question” (Killeen, 1972b). 

Rachlin asserts that Equation 21 is “derivable from 
the assumption that an organism choosing between 
alternatives is under no constraints except those the 
contingencies of reinforcement impose” (p. 249); it 
codifies our intuitions about choice. However, like 
Killeen (1972b), I do not see how this must follow, as 
many other relationships are possible. A far more 
complicated relation could hold between responding 
or time allocation and the reinforcement variables; 
why not T_1 - T_2 = V_1 - V_2 (see Lea & Morgan, 1972)? 
Moreover, it was the success of the empirical matching 
relation, not any intuitions about choice, that 
brought to the attention of researchers in operant 
conditioning the utility of the concurrent schedule 
procedure and the advantages of a relative measure of 
behavior in quantifying the law of effect. The value 
of an empirically based matching law lies in the ex- 
tent to which it makes valid predictions in new situa- 
tions, as well as in its utility for ordering data and 
pointing out anomalies. As Harré (1960) noted: “It is 
only when we begin to apply the law to cases further 
and further removed from the cases upon which it 
was based that we begin to run serious risks of a con- 
trary case appearing and disconfirming the law” (p. 
154). Three possibilities are open to us when the inevitable 
counterinstances to the matching law are 
found (Harré, 1960): 

One, we can uphold the law and note the contrary 
case. “This will be the appropriate action when there 
are various doubts in the investigator’s mind about 
the control of extraneous factors in the experimental 
set up” (Harré, p. 154). I have argued that this is the 
appropriate action in the case of studies by Schmitt 
(1974) for humans and Trevitt et al. (1972) for pigeons, 
and in the case of several experiments where 
undermatching probably resulted from a COD duration 
that was too short. 

Two, we may merely state the limitations on the 
law (e.g., Boyle’s law does not hold at very high pressures). 
In the present chapter I have stated some 
limitations on the application of the matching relation. 
Matching does not hold in the absence of a COD 
or when the COD is too short. Kulli’s unpublished 
experiment showing matching in the absence of a 
COD at low drive levels (p. 244) suggests that rate of 
alternation between the schedules, and not the COD 
procedure per se, is the crucial factor. If the rate of 
alternation is too high, responses to one alternative 
come under partial control of the other reinforcement 
schedule. 

White and Davison (1973) have suggested that the 
matching relation fails to hold when different patterns 
of responding are generated by the two schedules— 
e.g., on concurrent VI FI schedules (Nevin, 1971; 
Trevitt et al., 1972) or on concurrent paced schedules 
(Staddon, 1968; Moffitt & Shimp, 1971). In these cases 
the schedule contingencies tend to constrain response 
rates on the two schedules. When the patterns of responding 
are more similar (e.g., on concurrent FI FR schedules), the 
deviation from matching is much smaller (LaBounty & Reynolds, 
1973); and when the 
response patterns are the same on the two keys, even 
though responding is paced by the reinforcement contingencies 
(Shimp, 1971), matching is obtained. 

An important goal of future research will be the 
determination of further limitations on the matching 
relation. It will then be necessary to explain why 
matching does not hold in particular situations (see p. 
266) and to assess the power of the matching law in 
terms of the range of situations for which it gives an 
accurate quantitative description of choice. 

Third, we may attempt to “formulate a new gen- 
eralization which will include all the results obtained 
under all sets of conditions that have been investi- 
gated” (Harré, p. 155). For example, Baum (1974a) 
generalized the matching equation to account for bias 
or inherent preference for one alternative. In this way 
he could account for data previously thought to con- 
tradict matching (e.g., Fantino et al., 1972). However, 
Baum’s proportional ratio matching (Equations 4 and 
5) does not handle “results obtained from all sets of 
conditions [of choice] that have been investigated.” 
More general mathematical formulations of choice in 
concurrent schedules have therefore been suggested 
to replace the simple matching relation. 

Killeen (1972b) proposed that an additive differ- 
ence model of utility (Tversky, 1969) be considered a 
“working hypothesis” of the way in which different 
parameters of reinforcement determine choice: 
(23) 


where T, R, r, a, i, and x are as specified for Equations 
21 and 22. This equation states that choice can 
be predicted by a particular concatenation of subjec- 
tive scales of reinforcement. In the context in which 
each scale is derived, Equation 23 is true by definition, 
since the scales are defined by the choice behavior. 
But the equation becomes empirically falsifiable when 
extrapolations are made to new choice situations or 
when several subjective scales are combined (Killeen, 
1972b). It also assumes that there are no interactions 
among the subjective scales, and this too is empirically 
testable. 

Both the strength and weakness of Killeen’s formu- 
lation lie in its great generality. Current research in- 
dicates that a simple linear scaling of reinforcement 
produces matching for a number of different reinforcement 
parameters. If so, a far more parsimonious 
matching law than Equation 23 is possible. If the 
more general formulation embodied in Equation 23 
does prove necessary to describe choice in concurrent 
schedules, then the variables that affect each reinforce- 
ment function will need to be specified. For example, 
the function relating time ratios to reinforcement frequencies 
seems to be different for choice between two 
VIs and choice between a VI and an FI schedule. 

Staddon (1968, 1972) has argued that a power func- 
tion relation between behavior and reinforcement 
ratios—i.e., a specific case of Killeen’s equation—ac- 
curately describes all of the present data on choice in 
concurrent schedules: 


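$$\frac{T_1}{T_2} \text{ or } \frac{R_1}{R_2} = k\left(\frac{r_1}{r_2}\right)^{n} \tag{24}$$

where k is a parameter indicating any systematic bias 
toward one schedule and n is the exponent of the power 
function. Equation 24 reduces to proportional 
ratio matching when the exponent is 1.0; but it 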
also accounts for choice on paced schedules, concur- 
rent VI FI and concurrent FI FR schedules in which 
exponents of less than 1.0 were obtained. Staddon 
(1972) suggests that the power function formulation 
is required in situations where there is an intrinsic 
preference for one of the responses or schedules of 
reinforcement apart from the frequencies of rein- 
forcement associated with each. 
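
A power function of this kind is conveniently estimated on logarithmic axes, where it becomes a straight line whose slope is the exponent and whose intercept is log k. The following is a minimal sketch, assuming Equation 24 has the form (behavior ratio) = k x (reinforcement ratio)^n. The function name, the symbol n, and the data are illustrative assumptions and do not come from any study cited in this chapter.

```python
import math


def fit_power_function(reinf_ratios, behavior_ratios):
    """Ordinary least-squares fit of log(B1/B2) = log(k) + n * log(r1/r2).

    Returns (k, n): k is the bias parameter, n the exponent.
    All ratios must be positive.
    """
    xs = [math.log(r) for r in reinf_ratios]
    ys = [math.log(b) for b in behavior_ratios]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                        # slope = exponent
    k = math.exp(mean_y - n * mean_x)    # intercept = log(k)
    return k, n


# Hypothetical, noise-free data generated with k = 1.5 and n = 0.8
# (an exponent below 1.0, i.e. undermatching with a bias).
reinf = [0.25, 0.5, 1.0, 2.0, 4.0]
behav = [1.5 * r ** 0.8 for r in reinf]
k_hat, n_hat = fit_power_function(reinf, behav)
```

On these axes a change in k shifts the line up or down without altering its slope, while a change in n rotates it. That is why bias and the exponent can be separated in the data.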

Although it handles all of the data discussed in 
this chapter, the implications of the power function 
formulation must be further specified before its value 
relative to the simpler matching relation can be as- 
sessed. For example, the factors that lead to differ- 
ences in the power function exponent have yet to be 
specified. In some situations in which subjects have a 
strong bias toward one or another of the reinforcers 
the exponent is near 1.0—i.e., the matching exponent 
(Hollard & Davison, 1971). Bias or intrinsic preference 
accounts for deviations in the intercept of the choice 
function from (0, 0), but not for differences in the 
exponent (Baum, 1974a). Furthermore, while Herrn- 
stein (1970) has extended the matching relation to 
account for absolute as well as relative response rates, 
the implications of a power function formulation for 
absolute response rates (except where it reduces to the 
matching relation) are unclear. 

Finally, as discussed earlier, Herrnstein’s extension 
of the matching relation to absolute response rates 
may be used to account for some deviations from 
matching. For example, if values of k for two differ- 
ent response forms were previously determined in 
single schedules, it should be possible to predict the 
deviation from matching that would be observed in a 
concurrent schedule using those two responses (i.e., 
provided that the r_e-values were not too dissimilar for 
the two responses). There has been little quantitative 
investigation of choice between different response 
forms in concurrent schedules. 
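
The prediction can be sketched as follows, assuming that Herrnstein's (1970) equation for absolute response rate in a concurrent schedule is written with a separate asymptote for each response form. The subscripted k_1 and k_2 are an assumption introduced here for illustration.

```latex
R_1 = \frac{k_1 r_1}{r_1 + r_2 + r_e}, \qquad
R_2 = \frac{k_2 r_2}{r_1 + r_2 + r_e}
\quad\Longrightarrow\quad
\frac{R_1}{R_2} = \frac{k_1}{k_2} \cdot \frac{r_1}{r_2}
```

With a common r_e the denominators cancel, so the predicted deviation from simple matching is a constant bias equal to k_1/k_2. If r_e differed appreciably between the two responses, the denominators would no longer cancel and the response ratio would also depend on the absolute reinforcement rates, hence the proviso about similar r_e-values.
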

While there is thus general agreement about the 
empirical relation between response strength and rela- 
tive reinforcement, there is yet to be a consensus on 
the best mathematical form for expressing it. It is 
only as the implications and predictions of each formu- 
lation are specified and empirically tested that their 
relative merits can be evaluated. 


CONCLUSION 


In 1971, Herrnstein prefaced a paper entitled 
“Quantitative Hedonism” with the statement: “The 
occasion for this article is the conviction that a precise 
statement of reinforcement may be at hand, 
growing out of the discovery that frequencies of alternative 
forms of behavior occur in the same proportion 
as the resulting reinforcements” (p. 400). Several 
precise quantitative statements of response rate-reinforcement 
interactions in concurrent, multiple, 
and single schedules have been discussed in this chapter. 
Although the chapter has concluded that the 
available formulations cannot yet be fully evaluated, 
it also indicates that during the past decade significant 
progress has been made toward the goal of a quantified 
law of effect encompassing both positive and 
negative reinforcement. 


APPENDIX A 


Least-squares fit of Equation 11 to the data from 
studies of magnitude of reinforcement. The values of 
k and r_e and the percentage of the variance in the 
dependent variable accounted for by the equation 
are shown. 
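
The kind of fit reported in these appendices can be sketched in a few lines. Equation 11 is not reproduced in this section, so the code assumes it has the hyperbolic form R = k r / (r + r_e), with R the response measure and r the reinforcement magnitude. The function names, the search bracket, and the data below are hypothetical. The search over r_e also assumes a single minimum within the bracket.

```python
def hyperbola(r, k, r_e):
    """Assumed form of Equation 11: response strength as a function of r."""
    return k * r / (r + r_e)


def best_k(rs, Rs, r_e):
    """For a fixed r_e the model is linear in k, so k has a closed form."""
    f = [r / (r + r_e) for r in rs]
    return sum(R * fi for R, fi in zip(Rs, f)) / sum(fi * fi for fi in f)


def sse(rs, Rs, r_e):
    """Sum of squared errors at the best k for this r_e."""
    k = best_k(rs, Rs, r_e)
    return sum((R - hyperbola(r, k, r_e)) ** 2 for r, R in zip(rs, Rs))


def fit_hyperbola(rs, Rs, lo=1e-6, hi=None, iters=200):
    """Golden-section search over r_e; returns (k, r_e, percent variance)."""
    if hi is None:
        hi = 10 * max(rs)          # hypothetical upper bracket for r_e
    phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(rs, Rs, c) < sse(rs, Rs, d):
            b = d
        else:
            a = c
    r_e = (a + b) / 2
    k = best_k(rs, Rs, r_e)
    mean_R = sum(Rs) / len(Rs)
    ss_total = sum((R - mean_R) ** 2 for R in Rs)
    pct_variance = 100 * (1 - sse(rs, Rs, r_e) / ss_total)
    return k, r_e, pct_variance


# Hypothetical noise-free data generated with k = 80 and r_e = 0.5.
magnitudes = [0.1, 0.3, 1.0, 2.5, 5.0]
speeds = [hyperbola(r, 80.0, 0.5) for r in magnitudes]
k_fit, r_e_fit, pct = fit_hyperbola(magnitudes, speeds)
```

The percentage of variance accounted for is computed here as 100(1 - SSE/SS_total), which is one common definition. The appendices do not state which definition was used.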


CRESPI (1942): Running speed (100/time in sec) in a 
straight alley for five different weights of dog chow 
between .09 and 5.19 g. Groups of o to al rats per value. 

k = 80 (running speed units) 

Tp ee of chow) 

99.5% 


ZEAMAN (1949): Speed (100/latency in sec) to leave the start 
box of a short runway for six different weights of cheese 
between .05 and 2.4 g. Groups of 8 to 10 rats per value. 


k = 130 (100/latency in sec) 
r_e = .8 (g of cheese) 
89.1% 


HUTT (1954): Rate of lever pressing on a periodic-reinforcement 
1-min schedule for three different weights of rat 
chow between 3 and 50 mg. Groups of 9 rats per value. 


k = 12.4 (responses/min) 
r_e = 10.1 (mg of chow) 
99.6% 


BEIER (1958): Running speed (100/time in sec) in a straight 
alley for three numbers of pellets between 1 and 13 
pellets. Groups of 6 rats per value. 


k = 68.0 (running speed units) 
r_e = .67 (pellets) 
D947, 


LOGAN (1960): Running speed (100/time in sec) in a straight 
alley for four different numbers of food pellets between 
1 and 12 pellets. Groups of 6 rats per reinforcement 
value and drive level. 


12 hr deprived 
k = 45.1 (running speed units) 
r_e = .15 (pellets) 
90.0% 
48 hr deprived 
k = 51.0 (running speed units) 
n= 15 (pellets) 
4.997 


KEESEY & KLING (1961): Experiment I: Rate of key pecking 
on a VI 4-min schedule for four different numbers of 
peas between 1 and 4. Mean response rate of 4 pigeons, 
each exposed to all four values. 


k = 118.6 (responses/min) 
r_e = .16 (peas) 
12.2% 


Experiment II: Rate of key pecking on a VI 4-min 
schedule for three different numbers of hemp seeds between 
2 and 8 seeds. Mean response rate of 3 pigeons, 
each exposed to all three values. 
k = 67.0 (responses/min) 
r_e = .45 (seeds) 
ns ee es 


CATANIA (1963a): Rate of key pecking on a VI 2-min schedule 
for three different durations of access to grain between 
3 and 6 seconds. Mean response rate of 38 pigeons, 
each exposed to all three values. 
k = 64.6 (responses/min) 
r_e = .02 (sec of grain access) 


1.8% 
DI LOLLO (1964): Running speed (100/time in sec) in a 
straight alley for three different numbers of food pellets 
between 1 and 16 pellets. Groups of 16 rats per value. 


k = 38.9 (running speed units) 
7 = LO qpeltets) 
9919, 


DAVENPORT, GOODRICH, & HAGQUIST (1966): Rate of lever 
pressing on a VI 1-min schedule for three different 
numbers of sucrose pellets between 1 and 9 pellets. Four 
macaque monkeys, each exposed to all 3 values. 

Mean response rate of the four monkeys: 
k = 21.7 (responses/min) 
1, = Lf Gpelets) 
100.0% 


Individual monkeys: 

Monkey   k (responses/min)   r_e (pellets)   % of the variance accounted for 
S36      16.5                2.0             99.9 
S49      30.0                3.0             90.4 
S57      26.0                1.4             90.1 
S60      17.0                1.0             96.6 




APPENDIX B 


Least-squares fit of Equation 11 to the data from 
studies of the reinforcement magnitude of brain stimulation. 
The values of k and r_e and the percentage of 
the variance in the dependent variable accounted for 
by the equation are shown. 


KEESEY (1962): Rate of lever pressing on a VI 6-sec schedule 
for six different durations (.28 to 2.65 msec), intensities 
(.5 to 4.0 mA) and pulse frequencies (23 to 138 
pps) of hypothalamic brain stimulation. Mean response 
rates of 10 rats, each exposed to all six values of each 
parameter of brain stimulation. 

Duration: 
k = 15.1 (responses/min) 
r_e = .65 (msec) 
Stl, 
Intensity: 
k = 20.4 (responses/min) 
r_e = 2.0 (mA) 
92.9% 
Pulse Frequency: 
k = 25.0 (responses/min) 
r_e = 145 (pps) 
96.9% 


KEESEY (1964): Rate of lever pressing on a VI 16-sec schedule 
for five different durations (.25 to 2.0 sec) of posterior 
hypothalamic brain stimulation at two different 
intensities (3.0 and 1.5 mA). Mean response rates of 10 
rats, each exposed to all five durations at both intensities. 


3.0 mA 
k = 21.0 (responses/min) 
ees oe (sec) 
98.6% 
1.5 mA 
k = 14.9 (responses/min) 
r_e = 10 (sec) 
96.6% 


GALLISTEL (1969): Running speed (100/time in sec) in a 
straight alley for four to six different numbers of brain 
stimulation pulses (between 4 and 384 pulses) per reinforcement. 
Nine rats, each exposed to several different 
stimulation values. 


Individual rats: 

Rat      k (running speed units)   r_e (pulses of BS)   % of the variance accounted for 
LH 22    107.5                     1.2                  76.0 
LH 30    111.5                     5.0                  84.2 
LH 35    94.2                      2.5                  96.5 
LH 54    115.2                     1.2                  78.5 
DBB 35   106.0                     3.8                  96.2 
DBB 38   110.5                     11.9                 90.2 
DBB 44   110.0                     4.0                  92.9 
PH 35    108.0                     18.0                 77.8 
PH 40    102.0                     12.0                 95.4 


APPENDIX C 


Least-squares fit of Equation 11 to the data from 
studies of sucrose and glucose concentration. The values 
of k and r_e and the percentage of the variance in 
the dependent variable accounted for by the equation 
are shown. 


GUTTMAN (1953): Rate of lever pressing on a periodic-reinforcement 
1-min schedule for four different concentrations 
of sucrose (4 to 32%). Mean response rates of 20 
rats, each exposed to all four values; and mean response 
rate of groups of 20 rats per value. 


Mean of rats: 
b= BS (responses /min) 
r_e = 4.2 (% sucrose) 
92.4% 
Groups of rats: 
k = 11.5 (responses/min) 
r_e = 11.5 (% sucrose) 
95.6% 


GUTTMAN (1954): Rate of lever pressing on a VI 1-min 
schedule for seven different concentrations of sucrose 
and glucose (2 to 32%). Mean response rates of 8 rats, 
each exposed to all seven values of both reinforcers. 

Sucrose: 
k = 15.6 (responses/min) 
r_e = 7.1 (% sucrose) 
93.7% 

Glucose: 
k = 16.1 (responses/min) 
r_e = 11.0 (% glucose) 
1% 


CONRAD & SIDMAN (1956): Rate of lever pressing on a VI 
47-sec schedule for five different concentrations of sucrose 
(2-3 to 30%) at two drive levels (48 and 72 hr of deprivation). 
Mean of 3 monkeys, each exposed to all five 
concentrations at both drive levels. 

48 hr deprivation: 
k = 16.3 (responses/min) 
r_e = 3.7 (% sucrose) 
96.2% 

72 hr deprivation: 
k = 17.6 (responses/min) 
r_e = 1.7 (% sucrose) 
71.8% 


KRAELING (1961): Running speed (100/time in sec) in a 
straight alley for three different concentrations (2.4 to 
9.1%) and three different magnitudes (5 to 125 cc) of 
sucrose. Groups of 9 rats for each concentration and 
magnitude. 

5 cc: 
k = 89.4 (running speed units) 
r_e = 2.4 (% sucrose) 
100.0% 

25 cc: 
k = 87.9 (running speed units) 
r_e = 2.2 (% sucrose) 
WIZ, 

125 cc: 
k = 88.7 (running speed units) 
r_e = 1.4 (% sucrose) 
95.6% 


SCHRIER (1963): Rate of lever pressing on a VI 1-min schedule 
for five different concentrations of sucrose (10 to 
50%). Mean response rates of 4 monkeys, each exposed 
to all five concentrations. 

k = $7,5 (responses/min) 
r_e = 6.1 (% sucrose) 
95.40% 


SCHRIER (1965): Rate of lever pressing on a VI 30-sec schedule 
for five different concentrations (10 to 50%) and two 
different magnitudes (.33 and .83 cc) of sucrose. Response 
rates of 8 and 6 monkeys, each exposed to all 
five concentrations, 6 of them exposed to both magnitudes. 

Mean response rate of the 6 monkeys exposed to both 
magnitudes: 

.33 cc: 
k = 88.0 (responses/min) 
r_e = 16.1 (% sucrose) 
95.90% 

.83 cc: 
k = 66.5 (responses/min) 
r_e = 9.7 (% sucrose) 
96.35% 


Individual data from the 8 monkeys with .33 cc of sucrose: 


         k (responses/   r_e (%       % of the variance 
         min)            sucrose)     accounted for 
Ruth     109.1           17.0         94.5 
John      61.9            6.9         85.7 
Ken       91.0           21.5         98.6 
Allan     55.2            8           70.1 
Karen     87.5            9.5         95.2 
Joan     131.2           52.4         92.6 
Leo      105.9           40.1         98.6 
Mae        6.6           97.4         85.4 


Individual data from the 6 monkeys with .83 cc of sucrose: 


         k (responses/   r_e (%       % of the variance 
         min)            sucrose)     accounted for 
Ruth     82.4             5.5         84.9 
John     77.2             6.0         97.0 
Ken      86.5            20.0         97.7 
Allan    61.9            TD           90.8 
Karen    68.2            hie2         93.3 
Joan     45.8            65.0         98.3 


Least-squares fit of Equation 11 to the data from studies 
of immediacy of positive reinforcement (1/delay 
in sec). The values of k and r_e and the percentage of 
the variance in the dependent variables accounted 
for by the equation are shown. 


PERIN (1943): Speed of responding (100/latency in sec to 
lever press) in a discrete-trial situation with three different 
immediacies of reinforcement (2- to 10-sec delay). 
Groups of 25 rats per delay value. 

k = 54.0 (100/latency in sec) 
r_e = .29 (1/delay in sec) 
93.2% 

LOGAN (1960, Experiment 55D): Running speed (100/time 
in sec) in a straight alley for five different immediacies of 
reinforcement (1- to 30-sec delay). Groups of 10 rats 
per delay value, each run at high and low drive. 

High drive: 
k = 64.9 (running speed units) 
r_e = .016 (1/delay in sec) 
rs be re a A 

Low drive: 
k = 64.5 (running speed units) 
r_e = .075 (1/delay in sec) 
92.8% 

SILVER & PIERCE (1969): Rate of lever pressing on a VI 1-min 
schedule with five different immediacies of reinforcement 
(10- to 160-sec delay). Mean of 6 rats, each 
exposed to all five delay values. 

k = 9.8 (responses/min) 
r_e = .02 (1/delay in sec) 
95.0% 

PIERCE, HANFORD, & ZIMMERMAN (1972): Rate of lever pressing 
on a VI 1-min schedule with five different immediacies 
of reinforcement (.5- to 100-sec delay), the delay 
of reinforcement being signaled by a cue light. Also 
three immediacies of reinforcement (10- to 100-sec delay) 
with the lever retracted during the delay. Mean response 
rates of 4 rats, each exposed to all of the delay values in 
both delay conditions: 

Cue light: 
k = 21.4 (responses/min) 
r_e = .04 (1/delay in sec) 
96.1% 

Lever retracted: 
k = 106.0 (running speed units) 
r_e = .09 (1/delay in sec) 
95.0% 


Individual data from the cue-light condition: 


         k (responses/   r_e (delay   % of the variance 
         min)            in sec)      accounted for 
R1       21.5            .08          80.8 
R2       23.5            .15          98.3 
R3       22.8            .02          78.9 
R4       20.9            .07          97.4 


Individual data from the retracted-lever condition: 


         k (responses/   r_e (delay   % of the variance 
         min)            in sec)      accounted for 
R1       219.1           .63          97.8 
R2        67.5           .80          98.6 
R3       102.3           225          94.9 
R4        26.6           .04          92.9 


Least-squares fit of Equation 11 to the data from studies 
of magnitude of negative reinforcement. The values 
of k and r_e and the percentage of the variance in 
the dependent variable accounted for by the equation 
are shown. 


CAMPBELL & KRAELING (1953): Running speed (100/time in 
sec) in a straight alley for four different reductions in 
voltage (100 to 400 V) with a 400-V alley, and for three 
different voltage reductions (100 to 300 V) with a 300-V 
alley. Groups of 7 rats for each voltage reduction and 
each alley intensity. 



400-V alley: 
k = 228.0 (running speed units) 
r_e = 701.0 (V) 
92.9% 

300-V alley: 
k = 106.0 (running speed units) 
r_e = 1250 (V) 
98.9% 

BOWER, FOWLER, & TRAPOLD (1959): Running speed (100/ 
time in sec) in a straight alley (250-V alley intensity) for 
three different reductions in voltage (50 to 200 V). 
Groups of 5 rats for each voltage reduction value. 

k = 185.0 (running speed units) 
r_e = 338.0 (V) 
99.6% 

DINSMOOR & HUGHES (1956): Response speed (100/latency 
in sec to lever press) for four different durations of time-out 
(5 to 40 sec) from .2- and .4-mA shock. Groups of 5 
rats per value of time-out and intensity. 

.2 mA: 
k = 45.6 (100/latency in sec) 
r_e = 89.4 (sec time-out) 
be e/a 

.4 mA: 
k = 24.3 (100/latency in sec) 
r_e = 6.7 (sec time-out) 
74.1% 


HARRISON & ABELSON (1959): Rate of lever pressing for four 
different durations of time-out (2 to 20 sec) from loud 
noise (116 db). One rat exposed to all four time-out 
durations. 

kL, = a | (response / min} 
19 = AS (sec time-out} 


Oe 


SEWARD, SHEA, UYEDA, & RASKIN (1960): Running speed 
(100/time in sec) in a straight alley for three different 
voltage reductions (65 to 315 V) with two different alley 
intensities (315 and 255 V). Groups of 9 rats for each 
voltage reduction and alley intensity. 

315-V alley: 
k = 161.0 (running speed units) 
r_e = 62.0 (V) 
ed bat 

295-V alley: 
k = 146.0 (running speed units) 
Tie CO (Y) 
86.0% 

WOODS, DAVIDSON, & PETERS (1964): Swimming speed (100/ 
time in sec) in a cold-water tank (15°C water) for three 
different increases in water temperature (5 to 25°C). 
Groups of 10 rats per temperature increase. 

k = 163.0 (swimming speed units) 
r_e = 18.0 (°C) 
93.6% 

WOODS & HOLLAND (1966): Swimming speed (100/time in 
sec) in a cold-water tank for three different increases in 
temperature (4 to 16°C) with two different water tank 
temperatures (15 and 25°C). Groups of 16 rats for each 
temperature increase and water tank temperature. 

25°C tank water: 
k = 108.0 (swimming speed units) 
r_e = 19 (°C) 
85.9% 

15°C tank water: 
k = 114.0 (swimming speed units) 
(eG) 
87.9% 


APPENDIX F 


Least-squares fit of Equation 11 to the data from studies 
of immediacy of negative reinforcement (1/delay 
in sec). The values of k and r_e and the percentage of 
the variance in the dependent variable accounted for 
by the equation are shown. 


FOWLER & TRAPOLD (1962): Running speed (100/time in sec) 
in a straight alley (240-V alley intensity) with shock offset 
in the goal box delayed for five different delays between 
1 and 16 sec. Groups of 5 rats per delay value. 

k = 94.0 (running speed units) 
r_e = .06 (1/delay in sec) 
O 


LEEMING & ROBINSON (1973): Speed (100/latency in sec) of 
escape in a shuttlebox for five different immediacies of 
offset of a 420-V shock (1- to 16-sec delay). Groups of 10 
rats per delay value. 

k = 39.4 (100/latency in sec) 
r_e = 11 (1/delay in sec) 
86.0% 

MOFFAT & KOCH (1973): Speed (1/latency in .01 sec) of 
panel depression to escape time-out from a comedy 
record for three different immediacies of reinstatement 
of the record (3- to 9-sec delay). Groups of 10 human 
subjects per delay value. 

k = 304.1 (1/latency in .01 sec) 
r_e = 2.01 (1/delay in sec) 
96.6% 


TARPY (1969): Speed (100/latency in sec) of lever-press 
escape response on either of two levers in a discrete-trial 
situation for five different immediacies of offset of a 200-V 
shock (1- to 16-sec delay). Groups of 10 rats per delay 
value. 

k = 99.1 (100/latency in sec) 
i 383 d /delay in sec) 
BO 6, 

TARPY & KOSTER (1970): Speed (100/latency in sec) of lever-press 
escape response in a discrete-trial situation for 
three different immediacies of offset of a 200-V shock 
(1.5- to 6-sec delay). Groups of 10 rats per delay value. 

k = 45.5 (100/latency in sec) 
r_e = 40 (1/delay in sec) 
94.6% 


Conditioned Reinforcement: Schedule Effects* 


Lewis Gollub 


INTRODUCTION 


The topic of conditioned reinforcement has traditionally 
been concerned with the question whether a 
previously nonreinforcing stimulus could become a 
reinforcer through some conditioning operation.1 The 
answer to such a question depends in large part on 
the way the reinforcement process is conceptualized. 
In the traditional view, reinforcement has been considered 
a static property of a stimulus. Stimuli could 
thus be categorized as reinforcers or nonreinforcers. 
If categorized as reinforcers they would be expected to 
increase or maintain the frequency of responses they 
follow. 


* The author thanks W. H. Morse, who stimulated my early 
research on chained schedules, R. T. Kelleher, and A. C. Catania 
for provocative discussions over many years, and D. A. Stubbs for 
helpful comments on an earlier draft of this paper. A cordial 
environment for writing the review, and helpful support, were 
provided by Dr. T. Yanagita and K. Ando of the Central 
Institute for Experimental Animals, Kawasaki, Japan, while the 
author was on sabbatical leave from the University of Maryland. 
Thanks are due to S. Loftus, R. Crovo, and K. Flowers for help 
in preparing the manuscript. This work was supported, in part, 
by U.S.P.H.S. Research Grant MH-01604. 

1 This review deals only with positive reinforcement, i.e., 
operations involving the presentation of stimuli that increase the 
probability of responses they follow. Negative reinforcement is 
discussed in Chapters 7, 13, and 14. Further references in 
this chapter to reinforcement should always be understood to include 
only positive reinforcement. 

More recent conceptions of the reinforcement process 
emphasize the dynamic nature of reinforcement 
(e.g., see Morse & Kelleher, Chapter 7 of this volume). 
In this view, the reinforcement effect of a stimulus 
depends on many factors, such as the organism's deprivation, 
its past history of exposure to the stimulus, 
specific relations between the behavior elicited or controlled 
by a stimulus and the operant behavior under 
study (Premack, 1965), and, quite importantly, the 
schedule of stimulus presentation. In earlier treatments 
of conditioned reinforcement, however, the 
reinforcing effectiveness of the conditioned reinforcer 
was treated as a fixed property of the stimulus which 
could reveal itself under arbitrarily different conditions. 
The paradigm for studying conditioned reinforcement 
was a two-part experiment in which Part 
One consisted of “training,” where a given type of 
association with a primary reinforcer was used to imbue 
some arbitrary stimulus with conditioned reinforcing 
properties, and Part Two was a “test” of 
whether the stimulus was, in the absence of the 
primary reinforcer, a conditioned reinforcer. The 
major question was, “Under what conditions would a 
stimulus that was previously ineffective become a reinforcer?” 
In attempting to answer this question, the 
effect of the putative conditioned reinforcer was 
assessed during extinction with respect to primary 
reinforcement. The extinction techniques thus 
avoided the possible confounding of response maintenance 
by primary reinforcement in the test of the 
conditioned reinforcer. 

In experiments with the extinction methods the 
effects of the putative conditioned reinforcer have 
generally been weak, sustaining either relatively few 
responses, or lasting for only a short time. This 
lability presumably resulted from the fact that the 
association with the primary reinforcer which estab- 
lished the conditioned reinforcer was, by definition, 
broken during the extinction test. (Detailed reviews of 
the experimental literature using extinction tech- 
niques can be found in Kelleher & Gollub, 1962; 
Miller, 1951; Myers, 1958; Wike, 1966.) In addition to 
their empirical frailty, the putative conditioned rein- 
forcers were questioned on the grounds that the old 
response was maintained because of discriminative 
effects of the stimulus rather than reinforcing effects 
(see the previous reviews and Lott, 1967; Schuster, 
1969; Wike, 1969). Thus, the question of whether an 
arbitrary stimulus can acquire a long-lasting reinforce- 
ment effect, and replace a primary reinforcer, fre- 
quently degenerated into inconclusive debate on al- 
ternative stimulus functions. 

More recent experiments on conditioned reinforce- 
ment have emphasized response maintenance. In these 
procedures, responding is maintained by stimuli that 
have continued association with a primary reinforcer. 
Besides avoiding the traditionally intractable ques- 
tions that arise in the context of extinction pro- 
cedures, these newer techniques are also related more 
closely to the analysis of maintained responding gen- 
erally, and the sequential properties of behavior in 
particular (cf. Kelleher, 1966a). 

This review will be concerned primarily with the 
study of two paradigms, chained and second-order 
schedules of reinforcement. Both paradigms involve 
sequential response requirements, and the presenta- 
tion of stimuli that bear a scheduled relationship to 
primary reinforcement. In both paradigms behavior is 
continually maintained by primary reinforcement. 


CHAINED SCHEDULES OF REINFORCEMENT 


In a chained schedule of reinforcement, a single 
primary reinforcement follows the completion of a 
sequence of individual schedule requirements, each of 
which is accompanied by a characteristic stimulus. For 
example, consider the sequence in a chained schedule 
in which a variable-interval (VI) 2-min component 
precedes a fixed-interval (FI) 2-min component. A 
response key is initially lit green. After two minutes, 
on the average, the first peck changes the key color to 
red. The first peck that occurs two minutes or more 
after the key becomes red produces a small ration of 
food. The key then again becomes green, and the 
chain repeats. 
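
The contingencies in this example can be sketched as a small program. The following is only an illustration under simplifying assumptions: the variable intervals are drawn in rotation from a short hypothetical list whose mean is the nominal VI value, food delivery takes no time, and each component's timer starts at the peck that produced the stimulus change. The function name and the sample values are invented for the illustration.

```python
from itertools import cycle

def run_chain(peck_times, vi_intervals, fi_duration):
    """Replay peck times (in sec) through a two-component chain VI FI.

    Green key (VI): the first peck after the current interval has elapsed
    turns the key red.
    Red key (FI): the first peck at least fi_duration sec after the key
    turned red produces food, and the key becomes green again.
    Returns a list of (time, event) tuples.
    """
    intervals = cycle(vi_intervals)
    color = "green"
    required = next(intervals)
    component_start = 0.0
    events = []
    for t in sorted(peck_times):
        if t - component_start < required:
            continue  # this peck has no scheduled consequence
        if color == "green":
            events.append((t, "key turns red"))
            color, required = "red", fi_duration
        else:
            events.append((t, "food"))
            color, required = "green", next(intervals)
        component_start = t
    return events

# Hypothetical session: one peck every 30 sec for 10 min, with
# VI intervals of 60 and 180 sec (mean 2 min) and an FI of 2 min.
pecks = [30 * i for i in range(1, 21)]
events = run_chain(pecks, vi_intervals=[60, 180], fi_duration=120)
```

Only the pecks that meet a schedule requirement appear in the returned list; all other pecks are counted by the organism, so to speak, but not by the schedule.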

A chained schedule (chain) is denoted by a list of 
the individual schedules in their order of occurrence; 
the component stimuli must also be described. Thus, 
the preceding example would be a chain VI 2-min 
FI 2-min with green and red lights on the response 
key. When the component schedules are identical, a 
further shorthand is sometimes used, which specifies 
the number of components, and the schedule in each 
(sometimes called the unit schedule). Thus, if both 
components of a two-component chain were FI 2-min, 
or a chain FI 2-min FI 2-min, the schedule could be 
denoted chain FR 2 (FI 2-min). This indicates that a 
primary reinforcer is presented after completion twice 
(fixed ratio 2, or FR 2) of the unit schedule require- 
ment, FI 2-min. A characteristic stimulus is associated 
with each of the two components (cf. Kelleher, 1966a). 
The parentheses can be read as in mathematical func- 
tional notation, a chain FR 2 of FI 2-min. This nota- 
tion is particularly helpful in contrasting results 
under chained schedules with those under other 
sequential schedules, such as tandem and second-order 
schedules. 
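
The correspondence between the two notations can be made explicit with a short sketch. The helper names below are hypothetical and serve only to show that the shorthand is a repetition count applied to a unit schedule.

```python
def expand_chain(n_components, unit_schedule):
    """Write 'chain FR n (unit)' out as the full list of components."""
    return "chain " + " ".join([unit_schedule] * n_components)

def shorthand(n_components, unit_schedule):
    """Write n identical components in the functional shorthand."""
    return f"chain FR {n_components} ({unit_schedule})"

full = expand_chain(2, "FI 2-min")
short = shorthand(2, "FI 2-min")
```

Here `full` is the string "chain FI 2-min FI 2-min" and `short` is "chain FR 2 (FI 2-min)", the two forms given in the text.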


Overview 


Responding in individual components of chained 
schedules is primarily under the control of two variables: 
the component schedule and the temporal location 
of the component with respect to primary reinforcement 
(cf. Staddon, 1972). The analysis of chained 
schedules thus emphasizes two aspects of behavior: the 
temporal pattern of responding in each component, 
and the overall amount of responding in each com- 
ponent. In general, the pattern of responding in each 
component tends to resemble the pattern that occurs 
when responding under the same schedule produces 
primary reinforcement. Response rate increases from 
component to component towards primary reinforce- 
ment. 

The major emphases of research on chained schedules 
have been the acquisition of responding under 
chained schedules (transition performances, cf. Sidman, 
1960); parametric investigation of factors that 
determine the level of maintained responding; and 
the analysis of the behavioral functions of the component 
stimuli of chained schedules. 


Transition Performances Under Chained Schedules 


In a noteworthy series of experiments, Ferster and 
Skinner (1957) examined transition and steady-state 
performances in pigeons under chained schedules following 
training on related simple and multiple schedules. 
They studied two-component chained schedules 
composed of eight of the possible nine permutations 
of FI, FR, and VI schedule components.2 Pigeons exposed 
to these schedules had previously been trained 
under related compound schedules with the same 
component stimuli. For example, a pigeon was exposed 
to a FI 1-min schedule with a blue key, alternating 
with a red key during which pecks had no 
consequences (EXT). The blue key light was presented 
independently of key pecks after varying durations 
of the red key light. This arrangement comprised a 
multiple (mult) schedule, mult EXT FI 1-min. A 
chained schedule was later arranged by presenting the 
blue light and its associated FI 1-min schedule dependent 
on a peck after varying durations of the red 
light that averaged 1 min, a VI 1-min schedule. The 
chained schedule was thus chain VI 1-min FI 1-min. 
Under the mult EXT FI 1-min schedule, very low 
rates of pecking occurred while the key was red, and 
pecking began a short time after the key became blue 
and continued until food presentation. Within the 
first hour under the chained schedule responding in 
the presence of the red key light had increased from 
near zero to about 15 pecks per minute, and increased 
still more in the second session. Responding in the 
terminal component (blue key light) was unchanged 
from the performance during the multiple schedule. 

Ferster and Skinner (1957) also studied chained 
schedule performance after training on multiple 
schedules with food reinforcement in both components, 
e.g., a mult VI 3-min FI 1-min schedule. When 
performances in both components were stable, pecks 
in the presence of the VI 3-min stimulus produced not 
food, but rather the stimulus associated with the FI 1-min 
schedule (chain VI 3-min FI 1-min). Both birds 
exposed to this training regimen “substantially lost 
the performance on VI 3 formerly prevailing under 
the multiple schedule before developing a chained 
performance” (p. 660). In brief, when the stimulus 
presented in the initial component of the chained 
schedule had previously accompanied extinction, response 
rates increased under the chained schedule; 
when pecks in the component had previously produced 
food, pecking decreased under the chained 
schedule (cf. Ferster, 1953). 


2 For definitions of these simple schedules of reinforcement, 
and descriptions of performances they generate, see Chapter 8. 

Transition performances have not been explored 
very extensively since Ferster and Skinner’s observa- 
tions. Gollub (1958) examined responding under 
2- and 5-component chained schedules in pigeons that 
had previously been trained under comparable 
tandem schedules. In a tandem schedule, successive 
completion of two or more component schedules is 
required before primary reinforcement, but a single 
stimulus accompanies all components. In Gollub’s 
experiments, the pigeons had never been exposed to 
the key light colors associated with the earlier com- 
ponents of the chained schedule. Gollub found that 
response rates in the initial component of 2-component 
chains decreased during the first and second sessions, 
compared to rates under the tandem schedule, 
and then increased, typically to a value higher than 
that attained in the initial component under the 
tandem schedule. One pigeon studied under tandem 
and chained schedules with five components had 
consistently lower rates in the initial chain compo- 
nent. Kelleher and Fry (1962) also found marked de- 
creases in the response rate in the initial component 
within three sessions of chain FR 8 (FI 1-min) following 
30 sessions under the comparable tandem schedule. 
Thus the changes in performance under chained 
schedules seem to depend on both the earlier experi- 
mental history of the organism (extinction versus food 
reinforcement) and the number of components in the 
chained schedule. 

Such effects do not appear to depend on the specific 
training procedures, at least under some schedule 
values. A gradual increase to 60 sec in the interval 
durations of chain FI X FI X did not lead to terminal 
rates different from those attained by pigeons that 
were exposed to chain FI 60-sec FI 60-sec immediately 
after key peck acquisition (Gollub and Vogt, 1970). In 
a related study, Switalski and Thomas (1967) at- 
tempted to trace explicitly the development of stim- 
ulus control in the last 2 components of chain FR 3 
(VI 40-sec) in pigeons. An unlit key marked the initial 
component, and a monochromatic light (550 nm) and 
a line were presented in the last 2 components, with 
counterbalanced orders for the two groups of pigeons. 
Stimulus generalization gradients for pecking were 
determined for line tilt, and wavelength, after the 
fifth and twelfth training sessions respectively. At both 
points in training, typical performance for a chained 
schedule was demonstrated. At the first generalization 
test (after five training sessions) the generalization 
gradients were steeper for the terminal component 
than for the penultimate component. By the 12th 
session, about equal degrees of control were shown in 
each component by the chain. To paraphrase Switalski 
and Thomas (1967), stimulus control developed 
sequentially from the end of the chain forward. 

Less attention has been paid to the temporal pat- 
terns of responding that develop under chained sched- 
ules. The evidence presented by Ferster and Skinner 
(1957), as well as in later reports, shows that although 
the temporal patterns within components of chained 
schedules resemble the patterns generated by the cor- 
responding schedules of food reinforcement, there are 
also qualitative and quantitative differences. Responding 
in the initial components of chained schedules is 
more irregular, so that, for example, the steady rate 
typically seen under VI schedules of food reinforcement 
is replaced by alternating periods of responding 
and pausing (rough grain). There are pauses at the 
beginning of both FI and VI initial components that are 
longer than those typically observed with food rein- 
forcement. These effects have, unfortunately, not been 
subjected to experimental analysis. 


Maintained Responding Under Chained Schedules 


A major concern in research on chained schedules 
is the effect of the number, type, and quantitative requirements 
of the component schedules. Two effects 
are commonly reported. First, response rate tends to 
increase from earlier components to later components. 
Second, the rate of responding in a given component 
is determined by its separation from food presenta- 
tion, and is higher when the component is followed by 
fewer components or when later components are short 
or require few responses. To some extent, number of 
components and the schedule requirement in each 
component appear to be interchangeable in deter- 
mining temporal separation from food. 


CHAINED INTERVAL SCHEDULES 


Two experimental paradigms have been used to 
investigate the effects of duration of interval schedule. 
In one paradigm, only the terminal component is 
varied. In the second paradigm, all components are 
varied simultaneously. 

On the basis of the data then available, Kelleher 
and Gollub (1962) concluded that the rate of respond- 
ing in the initial component of 2-component chained 
schedules varied positively with the frequency or 
probability of food delivery in the terminal component. 
This relation holds both when food delivery is 
dependent on responding in the terminal component 
(Findley, 1962) and when food is presented indepen- 
dently (Kaufman & Baron, 1969). 

In chained schedules with more than two components 
(“extended chained schedules,” Kelleher & 
Gollub, 1962) the extent to which responding is sus- 
tained in the initial components also depends on the 
time that elapses from the end of the components to 
food reinforcement. 

The profound effect on responding of the number 
and duration of fixed-interval components is illus- 
trated in the following two experiments. In one 
experiment Gollub (1958) studied key pecking under 
chained schedules with 2, 3, 4, and 5 components, each 
of which was FI 30-sec. Responding in a given component 
was lower the farther the component was from 
food presentation. Figure 1 shows the mean rates of 
responding in each component as a function of the 
number of components intervening before food. A 
single curve connects the mean values. It can be seen 
that maintained response rate decreased steeply as 
a function of the increasing length of the chain. 

Fig. 1. Mean response rate in each component of chained schedules 
with 2, 3, 4, and 5 components, each FI 30-sec, plotted as a 
function of the number of components that intervene until the 
terminal component. The solid line connects the mean rates. 
Note that a logarithmic Y-axis is used. (Adapted from Gollub, 
1958.) 

Maintained response rates under three-, four-, and 
five-component chained schedules with different FI 
schedules in the components also were controlled 
largely by their separation from food reinforcement 
(Gollub, unpublished data). Figure 2 shows rates in 
each component of the following schedules: chain 
FR 3 (FI 45-sec), chain FR 4 (FI 30-sec), and chain FR 
5 (FI 15-sec), plotted as a function of the mean time 
that the midpoint of each component preceded food 
presentation. To a first approximation, rate was a decreasing 
exponential function of time from reinforcement. 
These experiments show that response rate decreases 
rapidly as components are separated from food 
presentation by even modest amounts. Thomas (1967) 
also found that the earlier the component, the more 
profound the rate change as schedule value was varied 
in chain FR 3 (FI X) from 0.25 min to 2 min. 
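
A decreasing exponential function appears as a straight line when rate is plotted on a logarithmic axis, so its parameters can be estimated by fitting a line to the logarithm of rate. The sketch below assumes the form rate = a·exp(−b·t), where t is the time from the midpoint of a component to food; the function name and the numbers are hypothetical and are not the unpublished data described in the text.

```python
import math

def fit_exponential(times, rates):
    """Least-squares line through (t, ln rate).

    Returns (a, b) for the assumed model rate = a * exp(-b * t).
    """
    n = len(times)
    logs = [math.log(r) for r in rates]
    mean_t = sum(times) / n
    mean_log = sum(logs) / n
    slope = (
        sum((t - mean_t) * (y - mean_log) for t, y in zip(times, logs))
        / sum((t - mean_t) ** 2 for t in times)
    )
    intercept = mean_log - slope * mean_t
    return math.exp(intercept), -slope

# Hypothetical values: midpoints of four FI 30-sec components, in sec
# before food, with rates generated from a = 60 and b = 0.02.
midpoints = [15.0, 45.0, 75.0, 105.0]
rates = [60.0 * math.exp(-0.02 * t) for t in midpoints]
a_hat, b_hat = fit_exponential(midpoints, rates)
```

Fitting in logarithmic coordinates weights proportional rather than absolute deviations, which matches the way the data are displayed on a logarithmic Y-axis.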

Fig. 2. Mean response rate in each component of chain FR 3 
(FI 45-sec), plotted as filled circles, chain FR 4 (FI 30-sec), 
plotted as triangles, and chain FR 5 (FI 15-sec), plotted as open 
circles. Each value is plotted as a function of average time at 
which the midpoint of the component preceded food delivery. 
Note that a logarithmic Y-axis is used. (Gollub, unpublished 
data.) 

Complex behaviors can also be maintained under 
chained schedules. Boren and Gollub (1972) studied 
matching to sample in pigeons under chain FR 3 (FI 
X), where X was varied from 16 to 128 sec. The rate of 
matches (pecks to a side key whose color matched the 
previously displayed color on the center key) increased 
from beginning to end of chain, and showed an accelerating 
pattern (scallops) within components. 
The effects of a given temporal separation are 
greater for FI component schedules than for compar- 
able VI schedules. For example, Gollub (1958) showed 
that a 5-component chained schedule with a FI 1-min 
in each component generated extensive pauses, 
especially in the first two components; with VI 1-min 
components, steady responding was sustained even 
though the components were as distant from food as 
with FI components. 

An experiment on 2-component chained schedules 
by Kendall (1967) suggests that the occasional presen- 
tation of short intervals in VI schedules constitutes 
the important difference between VI and FI compo- 
nents (cf. Catania & Reynolds, 1968). Kendall (1967) 
found higher rates in the first component of chain 
VI 1-min VI 1-min when the terminal VI 1-min schedule 
provided the first food delivery after 0.25 min 
than when the first food presentation was after 1.75 
min. 


CHAINED RATIO SCHEDULES 


Two experimental paradigms have been used to 
study chained ratio schedules. These paradigms 
parallel those used with chained interval schedules. In 
one paradigm, changes in the response requirement 
are made in only one FR component, and all other 
components are held constant. The other components 
may be fixed ratio, or another schedule. In the second 
paradigm, in which all components are fixed ratio, all 
of the components are changed simultaneously. 

An experiment of the first type was reported by 
Hanson and Witoslawski (1959). They showed that 
responding in the initial component of chain FI FR 
decreased as the number of responses required to pro- 
duce food delivery in the second component was in- 
creased from 5 to 120 responses. Since increases in FR 
requirement produced correlated increases in the 
duration of the terminal component, changes in the 
initial component rate may reflect changes in the 
duration of the terminal component rather than in 
response requirement (cf. Killeen, 1969). Findley 
(1962) showed that changing the ratio requirement in 
only one component of chain FR FR FR affects re- 
sponding in only the changed component and those 
preceding it, with quantitatively greater effects earlier 
in the chain. 

An example of the second paradigm for studying 
chained FR schedules, in which the distribution of a 
given response requirement in different components 
is varied, was reported by Jwaideh (1973). She com- 
pared responding under 3- and 5-component chained 
FR schedules in which the total number of responses 
required for food presentation was constant. For 
different pigeons, the total number of pecks required 
was 60, 120, or 180. For example, one pigeon was 
studied under a chain FR 3 (FR 20) and chain FR 5 
(FR 12). Post-reinforcement pauses were longer under 
the five-component chains than under the three-component
chains. Response rates following the pause
did not, however, vary systematically with different 
chained schedules. In a similar experiment, Ferster 
and Skinner (1957) found longer pauses before peck- 
ing began under chain FR 4 (FR 30) than under 
chain FR 15 FR 105. Segmenting a total number of 
responses into differing numbers of components seems 
to have a greater effect on responding than corre- 
sponding segmentation of an interval (cf. Figure 2). 
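
The arithmetic behind this design can be made explicit. The following Python sketch is an illustration only and is not part of Jwaideh's procedure. The function name `equal_fr_chain` is hypothetical, and the sketch assumes that the total response requirement is divided evenly among the components, as in the chain FR 3 (FR 20) and chain FR 5 (FR 12) example.

```python
def equal_fr_chain(total_responses, n_components):
    """Return the FR value of each component in a chain FR n (FR k) schedule.

    Hypothetical helper: splits a fixed total response requirement evenly
    across n_components and raises if the split is not exact.
    """
    if total_responses % n_components != 0:
        raise ValueError("total is not evenly divisible by the number of components")
    return total_responses // n_components


# Totals mentioned in the text: 60, 120, or 180 pecks per food presentation.
for total in (60, 120, 180):
    for n in (3, 5):
        k = equal_fr_chain(total, n)
        print(f"total {total}: chain FR {n} (FR {k})")
```

With a total of 60 pecks the sketch reproduces the two schedules named above; the values for 120 and 180 follow from the same division and are not quoted from the original report.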


OTHER COMPONENT SCHEDULES 


Experiments on chained FI or chained FR compo- 
nents generally reveal a monotonic increase in re- 
sponse rate from beginning to end of the chained 
schedule sequence. Such rate patterns are due to the 
specific component schedules, and are not necessary 
consequences of the chaining operation per se. Re- 
quiring a low rate in the terminal component can 
reverse the overall rate increase. Ferster and Skinner 
(1957) scheduled a chain FR 95 DRL 6-sec in which 
the ninety-fifth peck changed the key from red to 
purple; the first peck that was spaced six seconds or 
more from a preceding peck while the key was purple 
produced food delivery. High rates of responding fol- 
lowed a brief pause during the initial red light compo- 
nent, and a very low rate prevailed during purple 
until the DRL requirement was met. The entraining 
effect of periodic food delivery is thus only one factor 
in controlling regular patterns of responding under 
chained schedules. 
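
The contingencies of chain FR 95 DRL 6-sec can be sketched as a small state machine. This is a minimal illustration, not the original programming apparatus. The function name `run_chain_fr_drl` is hypothetical, and the sketch assumes that the ninety-fifth peck itself starts the six-second spacing requirement.

```python
def run_chain_fr_drl(peck_times, fr=95, drl_sec=6.0):
    """Process sorted peck times (in seconds) for one cycle of chain FR DRL.

    Returns (time_of_stimulus_change, time_of_food). Either value is None
    if the corresponding requirement was never met.
    Assumption: the peck that completes the FR starts the DRL timer.
    """
    change_time = None   # moment the key changes from red to purple
    food_time = None
    count = 0
    previous = None      # time of the most recent peck
    for t in peck_times:
        if change_time is None:
            # Initial (red) component: count pecks toward the ratio.
            count += 1
            if count == fr:
                change_time = t
        elif t - previous >= drl_sec:
            # Terminal (purple) component: a sufficiently spaced peck produces food.
            food_time = t
            break
        previous = t
    return change_time, food_time


# 95 pecks at two per second, then three pecks in purple.
pecks = [0.5 * i for i in range(1, 96)] + [49.0, 52.0, 59.0]
print(run_chain_fr_drl(pecks))
```

In this example the first two pecks in purple follow their predecessors by less than six seconds and produce nothing; only the third satisfies the DRL requirement.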


SUMMARY OF SCHEDULE EFFECTS 


Parametric investigations of chained interval and 
chained ratio schedules implicate three important fac- 
tors controlling response rate: the type and the num- 
ber of component schedules, and the requirement in 
each. These variables operate in the context of recurring
food presentations (cf. Staddon, 1972). When
the component schedules are all of the same type (e.g., 
all fixed ratio), the average response rate in the initial
component is lowest and the rate increases throughout
the chain until food presentation occurs. The response 
rate in a component is controlled primarily by the 
separation in time of that component from food 
presentation, so that number of components and the 
schedule requirement in each are interchangeable 
when these temporal relationships are constant, as 
under interval schedules. Changes in overall response 
rate are, however, modulated by the component sched- 
ules so that temporal patterns of responding in a 
component resemble those typically found under the 
same isolated schedule of food presentation. 

Previous accounts of chained schedules (Ferster & 
Skinner, 1957; Kelleher & Gollub, 1962; Kelleher, 
1966a) have appealed to a dual function of component 
stimuli as both discriminative stimuli that control a 
rate and pattern of responding appropriate to the 
component schedule, and conditioned reinforcing 
stimuli for behavior in a preceding component. This 
interpretation has proved to be exceedingly difficult to 
analyze experimentally, at least in part because un- 
ambiguous and independent measures of stimulus 
control and reinforcement strength are not currently 
available. The next section of this chapter reviews 
some of the ways in which stimulus factors have been 
investigated. 


Stimulus Functions in Chained Schedules 


Four types of manipulation have been used to 
study the role of stimuli in chained schedules: (1) 
omitting stimulus changes with otherwise identical
schedule contingencies (tandem schedule); (2) changing
the order of presentation of the component
stimuli; (3) scheduling the sequence of stimuli independently
of responding; and (4) presenting the component
stimuli for brief, response-dependent exposures.


TANDEM SCHEDULES COMPARED TO
CHAINED SCHEDULES


A tandem (tand) schedule is one “in which a single 
reinforcement is programmed by two schedules acting 
in succession without correlated stimuli” (Ferster & 
Skinner, 1957, p. 733). Thus, for every chained sched- 
ule there is a corresponding tandem schedule, with the 
same response requirements. The system for denoting 
tandem schedules is similar to that for chained sched- 
ules. Dissimilar component schedules are listed after 
the schedule abbreviation, thus, tand FI 2-min FR 5. 
When the same schedule appears in each component, 
a denotation like FR X (FI 1-min) is used. Note in
this case that no special term is needed to indicate a
tandem schedule since it can be described as a second-order
schedule (cf. Kelleher, 1966a).

Fig. 3. Tandem and chain FI 3-min FI 2-min. Record A is the transition from tandem
to chained schedule (at the arrow). Record A1 is the continuation of that session.
Records B, C, and D show further development in the chained schedule: Record B is
from the second, C from the fourth, and D from the eighth session on the chained
schedule. Pips indicate change from one schedule component to the next. Small dots
above the record indicate food deliveries. (From Kelleher & Gollub, 1962. © 1962 by the
Society for the Experimental Analysis of Behavior, Inc.)
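
The notation described here can be summarized in a short sketch. The function `schedule_label` is hypothetical and only mirrors the written conventions used in this chapter: dissimilar components are listed after the abbreviation, and identical components are collapsed into the FR X (component) form.

```python
def schedule_label(kind, components):
    """Build a schedule label from a list of component schedules.

    kind: "chain" or "tand" (hypothetical argument names).
    components: component schedule strings, e.g. ["FI 2-min", "FR 5"].
    """
    if kind not in ("chain", "tand"):
        raise ValueError("kind must be 'chain' or 'tand'")
    if len(set(components)) == 1:
        # Same schedule in every component: FR X (component).
        return f"{kind} FR {len(components)} ({components[0]})"
    # Dissimilar components are listed in order.
    return f"{kind} " + " ".join(components)


print(schedule_label("tand", ["FI 2-min", "FR 5"]))
print(schedule_label("chain", ["FI 1-min"] * 3))
```

The sketch always writes the `tand` or `chain` prefix, which matches labels such as tand FR 2 (FI 2-min) used later in this section.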

Gollub (1958) compared the performances of 
pigeons under tandem FI schedules with later per- 
formances of the same pigeons under the correspond- 
ing 2- and 5-component chained schedules. In all 
cases, the response rate in the terminal component of 
the chained schedule increased over that previously 
maintained in the terminal component of the tandem 
schedule. The rate in the initial component increased
in two-component chains, but decreased with five
components. The temporal pattern of responding also
changed. Figure 3 shows the transition from tand FI
3-min FI 2-min to chain FI 3-min FI 2-min. A similar
result for chain FR 3 (FI 1-min) components was reported
by Kelleher and Fry (1962).

In some experiments, however, rates in tandem and 
chained schedules have not differed substantially. For 
example, Malagodi, DeWeese, and Johnston (1973) 
studied pigeons under chain FR 2 (FI 2-min) and 
tand FR 2 (FI 2-min), and found comparable average 
rates in both schedules. Appropriate temporal pat- 
terns of responding occurred under the chained 
schedule. 

Performances can be studied under both tandem 
and chained schedules in each session if there is one 
stimulus for the tandem schedule and additional 
stimuli for the chained schedule components (multi- 
ple chained and tandem schedules, cf. Thomas, 1964). 
Gollub (1965) and Thomas (1967) obtained similar
results studying multiple chained and tandem FI
schedules. Thomas (1967) studied 3-component schedules
with fixed intervals of 0.25 to 2.0 min. Generally,
the rates of responding maintained in the terminal 
component of chained schedules were higher than in 
corresponding tandem schedules with the difference 
decreasing with increasing values of the fixed intervals.
The rates in the initial and middle components
of the tandem schedules were higher than those under 
chained schedules, with the difference increasing with 
increasing interval duration. 

Jwaideh (1973), in the experiment described above 
on chained and tandem FR schedules, found that the
pause after food presentation was generally longer for
chained than for tandem schedules (cf. Ferster & Skin- 
ner, 1957). In addition, the rate after responding be- 
gan (the running rate) in the first component was 
lower in chained than in tandem schedules in most 
cases, and the mean time to complete the required 
response number was greater for chained than tandem 
schedules, with the difference increasing with increas- 
ing total response requirement. Thomas (1964) also 
found lower average rates in the initial chain compo- 
nent of multiple chained and tandem FR schedules. 

In summary, experiments comparing performance 
under tandem and chained schedules have yielded 
complex results. For two-component chains of FI 
schedules, the rate in the first component under chain 
was generally higher than tandem (Gollub, 1958), but 
not always (Malagodi, DeWeese, & Johnston, 1973).
Higher chained schedule rates have been interpreted
as showing a conditioned reinforcing effect: respond- 
ing that produces the terminal stimulus occurs at a 
higher rate than responding the same distance from 
food that does not change the stimulus conditions. 
Chains with more than two FI components have con- 
sistently shown lower rates in the initial component 
than occur under corresponding tandem schedules 
(Gollub, 1958; Kelleher & Fry, 1962). Similarly, rates 
were lower and pauses were longer in the initial component
of three- and five-component chained FR
schedules than in corresponding tandem schedules
(Ferster & Skinner, 1957; Jwaideh, 1973; Thomas,
1964).

The comparison of chained and tandem schedules 
is more complicated, however, than might appear at 
first glance. Although schedules that are formally 
identical can be arranged, whether they are sufficiently 
comparable is a moot question. For example, in what 
sense is a FI component of a chained schedule in 
which its duration is marked precisely by a component 
stimulus comparable to a FI component of a tandem 
schedule without accompanying time markers? The 
concept of reinforcement contingencies implies that of 
discriminative stimuli (Skinner, 1969). Comparisons 
among schedules which are as different as tandem and 
chained may involve too many differences to illu- 
minate specific aspects of control by either schedule. 


VARYING THE ORDER OF STIMULI 
IN CHAINED SCHEDULES 


The discriminative and reinforcing functions of the 
component stimuli of chained schedules have been 
investigated by changing their order of presentation. 
Several experiments have used the following tech- 
nique. Responding is first stabilized under a chained 
schedule. The order of stimuli is then changed, and 
the transition and stable performances under the new 
order are studied. In general, the results have shown 
strong control of response rate by the prevailing 
stimulus, so that high rates can be produced in early 
components of the chained schedule when a stimulus 
that has been presented previously toward the end of 
the chain is presented early. In addition, a stimulus 
that used to occur near the end of a chain can increase 
responding when it is presented as a consequence of 
responding in an earlier component. 

After establishing performances under chain FR 3 
(FI 1.5-min), Kelleher and Fry (1962) presented the 
three stimulus components in different orders on suc- 
cessive exposures to the chain. At first, the three colors 
controlled rates very similar to those maintained in
their original chained schedule components. For
example, the reverse order from the chained schedule 
produced a negatively accelerated pattern of respond- 
ing. With continued exposure to the variable order 
of stimuli the pigeons responded at a more or less 
constant rate throughout each schedule sequence, with 
only brief pauses, if any, in the initial component. 
This pattern resembled that under the tandem sched- 
ule, with a single stimulus presented throughout the 
chain. After still further training, a new pattern of 
responding developed. Pauses after food were seldom 
longer than the scheduled interval (1.5 min), and 
miniature scallops comprised of a pause followed by 
accelerated responding typically occurred in both the 
middle and terminal components. Similar effects of 
reversing the order of component stimuli of chained 
FR schedules were reported by Jwaideh (1973) and by 
Ferster and Skinner (1957) with a continuously vary- 
ing stimulus (an added counter). 

The preceding experiments as well as others (Find- 
ley, 1962; Marr, 1971) on the effects of varying the 
stimulus sequence and the relationship of stimulus 
order to food presentation demonstrate that exposure 
to chained schedules develops strong discriminative 
control of responding by the component stimuli. This 
control is demonstrated by the maintenance of characteristic
rates by each stimulus component when the
stimulus is presented in a new sequence. These effects 
demonstrate little, however, about conditioned rein- 
forcing effects of chained schedules. One other manip- 
ulation of stimulus order in chained schedules can 
provide relevant data here: scheduling the terminal 
stimulus in two or more different components of a 
chained schedule. Responding during the early presen- 
tation of that stimulus would show discriminative 
effects, independently of time of presentation. Responding
in the presence of the stimulus component
that precedes an early presentation of the terminal 
stimulus component would reflect the reinforcing 
effects of that stimulus as a consequence of re- 
sponding. 

Byrd (1971) studied the performances of pigeons 
on chained schedules with 3, 5, or 7 FI 1-min compo- 
nents in which the same key color was presented in 
the terminal component as in some earlier components
(e.g., in the 1st, 3rd, and 5th, as well as in the
7th component). Different colors were presented in the 
other components. Strong rate-controlling effects of 
such repeated stimuli were demonstrated. In all cases,
higher rates occurred during the early presentation of
the repeated color than in the component that followed
it. In one phase of the experiment Byrd (1971)
presented one color (amber) in the 1st, 3rd, and 5th
components of a 7-component chain, and unique
colors in each of the other components. The rates in 
those components preceding amber (2nd and 4th) 
were compared with the rates previously maintained 
when the repeated stimulus also was paired with food, 
i.e., was presented in the 7th component. Rates were
not higher under the latter condition. Byrd (1971) 
concluded that these results did not reveal a con- 
ditioned reinforcing effect of the component stimuli. 

An unpublished experiment by Gollub indicates 
that chain component stimuli presented out of order 
can increase responding that produces these stimuli, 
but that these effects may be transient. Responding 
was maintained under chain FR 4 (FI 1.5-min). The
color presented in the terminal component was then 
also presented in the second component. Responding 
increased immediately in both the initial and second 
components. Over the succeeding 27 sessions, however,
responding in the initial component returned to
its previous value, and responding in the second component
also decreased markedly. Thus, an initial rate-enhancing
effect of a chain component stimulus can
disappear with continued training. With continued
training, white after orange (or 1.5 min after food)
gains different discriminative control from white after
blue (or 4.5 min after food). Byrd's (1971) pigeons had
been exposed to presentation of the terminal stimulus
at multiple locations in the chain for 115 sessions before
the comparison mentioned above. It would not
be surprising if control by the complex stimulus had
developed by this time.


CLOCK SCHEDULES


A major effect of the initial stimulus, especially in 
chains with more than two components, is discrimina- 
tive. It controls very low rates. By definition, the ini- 
tial stimulus component occurs furthest from the next 
food presentation, and responses in its presence are 
never followed promptly by food. Outside the chain- 
ing situation, such a stimulus would be referred to as 
SΔ or S−, and would normally control very low response
rates (see Chapter 15).

An interesting comparison to chained schedules is 
the “added clock” of Ferster and Skinner (1957). They 
presented a spot of light on the key that grew pro- 
gressively longer in a FI period reaching a maximum 
length when the next peck would produce food. The
effects were quite dramatic. Zero response rates oc- 
curred in early parts of the interval, and were fol- 
lowed, often with a relatively rapid transition, by 
very high rates, sustained until food presentation. 
When the clock was “run backwards,” with the long
line presented immediately after food presentation
and shrinking to a spot at the end of the FI period, 
response rates were maximal at the beginning of the 
interval, and decreased to zero as time passed. These 
results resemble effects reported by Jwaideh (1973) 
and Kelleher and Fry (1962) of comparable manipula- 
tions of chained schedule component stimuli. (See also 
Segal, 1962; Boren & Gollub, 1972.) 

With pigeons on chained FI schedules, the early 
components frequently exceed their scheduled dura- 
tions, since low rates with extensive pauses prolong 
them. The discrete-value “clock” of the chained sched- 
ule thus has settings with unequal durations. Tallen 
and Dinsmoor (1969) produced such a 3-valued clock 
by presenting the key colors to one pigeon that were 
simultaneously produced by a second pigeon under a 
chain FR 3 (FI) with FI values of 20-sec, 30-sec, or 
45-sec (yoked-box technique). Responding under the 
clock schedule was confined almost entirely to the 
final stimulus (97.6-100% of all pecks), whereas pigeons
under the chained schedule responded throughout
the chain, with only 41.3-72% of their pecks in
the terminal component, and the remainder of their 
pecks in the first two components. 

In summary, responding occurs at very low rates in 
the presence of stimuli that are never associated with 
food. When responding in the presence of such stimuli
is required for progression through the chain, response
rates increase substantially. Such comparisons
do not, however, distinguish the effects on responding 
of the response-dependent presentation of subsequent 
chain stimuli from the effects of the response require- 
ment for the ultimate presentation of food. More 
comprehensive designs incorporating comparable tandem,
chained, and clock procedures would help compare
the discriminative effects of component stimuli
(clock vs. chain) and the effects of the response de- 
pendencies (chain vs. tandem). 


BRIEF PRESENTATION OF COMPONENT
AND CLOCK STIMULI


If component stimuli are reinforcers, a response 
that produces them even outside a chaining situation 
should be maintained. Hendry and Dillow (1966) 
found that clock stimuli from FI 3-min and FI 6-min 
maintained pecks on a second key, with presentation 
of the terminal stimulus maintaining the highest rate. 
Whether each clock stimulus alone could maintain 
pecking was not directly assessed. 

Kendall (1972) investigated this question with a 
similar procedure. Pecks on one key produced food 
under FI 3-min. Pecks on a second key produced a
0.2-sec presentation of white, green, or red key-light
in the first, second, and third minutes of the interval, 
respectively. In two later conditions of the experi- 
ment, pecks on the second key produced red during 
the last minute, and had no consequences during the 
first two minutes, or produced white and green during 
the first and second minutes, respectively. Figure 4 
shows that pecks that produced time-correlated stimuli
were maintained only when red, the terminal
minute stimulus, could be produced either as one of 
three stimuli, or as the only stimulus. When only the 
first two stimulus components could be produced, 
responding was not maintained. Whether green pre- 
sented during the middle minute would alone sustain 
responding was not assessed. White, appearing when 
food was remote, might have suppressed pecks, masking
any rate-enhancing effect of response-dependent
presentations of green (cf. Mulvaney, Dinsmoor,
Jwaideh, & Hughes, 1974, for a direct demonstration
of response suppression by S−). The experiment does
show clearly that responses must produce the terminal 
clock stimulus at least part of the time to be main- 
tained. Since the “information” concerning the occur- 
rence of the terminal minute is the same whether a 
unique positive signal (red) is given or the lack of a 
negative signal (white or green) occurs, this experi- 
ment strengthens the line of argument that association 
with food or other positive reinforcers is necessary for 
the development of response-maintaining effects by
stimuli. Stimuli that provide the same “information”
concerning elapsed time in the interval, but that have
no close behavioral contiguity with food, did not sustain
behavior (cf. Chapter 11).

Fig. 4. Fixed-interval and observing response rates for each of
the three birds in each of the three conditions. Fixed-interval
rates are plotted in the upper panel and observing response
rates in the lower panel for each bird. (From Kendall, 1972. ©
1972 by the Society for the Experimental Analysis of Behavior,
Inc.)
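
The three conditions of Kendall's (1972) procedure can be laid out as a simple lookup. The sketch below is illustrative only: the function name `observing_stimulus` and the condition labels are hypothetical, and it assumes that red remains the available stimulus if the interval runs past three minutes before the food-producing peck occurs.

```python
def observing_stimulus(elapsed_sec, condition="all"):
    """Return the key color briefly produced by a second-key peck, or None.

    elapsed_sec: time since the start of the FI 3-min interval.
    condition (hypothetical labels):
      "all"              white, green, red in minutes 1, 2, 3
      "red_only"         red in the last minute, nothing earlier
      "white_green_only" white and green in minutes 1 and 2, nothing later
    """
    if elapsed_sec < 0:
        raise ValueError("elapsed time cannot be negative")
    minute = min(int(elapsed_sec // 60), 2)  # assumption: red persists past 180 sec
    color = ("white", "green", "red")[minute]
    allowed = {
        "all": {"white", "green", "red"},
        "red_only": {"red"},
        "white_green_only": {"white", "green"},
    }[condition]
    return color if color in allowed else None


print(observing_stimulus(30))                        # first minute, all stimuli
print(observing_stimulus(150, "red_only"))           # last minute, red available
print(observing_stimulus(150, "white_green_only"))   # last minute, nothing produced
```

Pecking on the second key was maintained only in the two conditions where this lookup can return red.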

Marr (1969) found the corresponding result for 
stimulus components of chained schedules with a com- 
plex one-key procedure. Briefly, higher rates were 
maintained by brief presentations of the terminal 
stimulus than by presenting the initial stimulus.

In summary, these experiments show that stimuli 
that are serially correlated with elapsed time in the 
interreinforcement interval of either fixed-interval or 
chained schedules control the rate of responses that 
produce them. Brief presentation of the terminal 
stimulus of a 3-component chain will maintain con- 
siderably higher rates than presentations of the first 
stimulus; presentation of the terminal stimulus alone 
will maintain as much responding as presenting all 3 
clock stimuli; and presenting only the first 2 of 3 clock 
stimuli will not maintain responding. 


Concurrent Chained Schedules 


A discussion of chained schedules would not be 
complete without some consideration of concurrent 
chained schedules. (See Chapter 11 for a more com- 
prehensive discussion of this topic.) This procedure 
has a basic appeal both for examination of the param- 
eters of reinforcement in chained schedules (Kelleher 
& Gollub, 1962) and for scaling diverse reinforcement 
variables (cf. Baum & Rachlin, 1969).

The procedure was developed by Autor in his doc- 
toral dissertation in 1961, and later published (Autor, 
1969). The schedule can be conveniently divided into 
two parts. In the first (concurrent) part, two response 
keys are lighted. Equal VI schedules are associated 
with each key so that a peck occasionally produces the 
terminal component associated with that key. In the 
second (terminal component) part of the schedule, the 
color of the key changes, and the other key is not lit. 
Pecks in the terminal component produce food ac- 
cording to a schedule associated with that key. After 
food, the concurrently lit keys are again available. 
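
The two-part structure of the procedure can be sketched as a state machine. This is a minimal illustration under stated assumptions, not Autor's apparatus: the class name `ConcurrentChains` is hypothetical, and the VI timers are abstracted into a `schedule_satisfied` flag supplied by the caller.

```python
class ConcurrentChains:
    """Minimal state machine for the concurrent chains procedure (a sketch)."""

    def __init__(self):
        # "initial" means both keys are lit; otherwise the state names
        # the single key whose terminal component is in effect.
        self.state = "initial"

    def lit_keys(self):
        if self.state == "initial":
            return {"left", "right"}
        return {self.state}

    def peck(self, key, schedule_satisfied):
        """Register a peck and return the resulting event.

        schedule_satisfied: True if the schedule currently in force on
        this key has set up its consequence (assumed to be timed elsewhere).
        """
        if key not in self.lit_keys():
            return "no effect"          # dark key
        if not schedule_satisfied:
            return "none"
        if self.state == "initial":
            self.state = key            # enter that key's terminal component
            return f"enter terminal component ({key})"
        self.state = "initial"          # food, then both keys are lit again
        return "food"


session = ConcurrentChains()
print(session.peck("left", True))
print(session.peck("right", True))
print(session.peck("left", True))
```

The middle peck in the usage example lands on the dark key and has no scheduled consequence, which is the sense in which concurrent scheduling holds only during the first part of the schedule.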

Autor (1969) showed that the relative rates of re- 
sponding during the concurrent part of the schedule 
were similar to the relative frequencies of reinforce- 
ment during the terminal components. This matching 
relationship occurred not only when food presentation 
in the terminal components was response-dependent 
under VI schedules, but also when food was presented 
occasionally and pecking was not permitted. Herrn- 
stein (1964) extended these results by scheduling food 
presentations under variable-ratio (VR) schedules. He
found that response proportions in the concurrent
components tended to match more closely the relative 
temporal frequencies of food presentation than rela- 
tive number of responses per food presentation. Be- 
cause of the growing interest in studying the effects of 
reinforcement variables on relative measures of re- 
sponding (Herrnstein, 1961), many papers followed 
these initial studies. Relative response rate was stud- 
ied when the terminal components differed in: num- 
ber of food presentations (Fantino & Herrnstein, 1968); 
FR versus VR schedules (Fantino, 1967); schedules of 
differential reinforcement of response rate (Fantino, 
1968); amount and frequency of food presentation 
varying jointly (Ten Eyck, 1970); and FI versus VI 
schedules (Davison, 1969). In these and related studies, 
the relative rate of responding in initial concurrent 
links was used to scale preference or value (Baum & 
Rachlin, 1969; Killeen, 1972) of the terminal links. 
Concurrent chained schedules have even been used to 
study reinforcement variables in simple chained sched- 
ules themselves. In these experiments, two concurrent 
initial components lead either to a chained schedule 
or to a uniform stimulus providing food after about 
the same delay as the chained schedule. Schneider 
(1972) compared chain VI VI and chain FI FI sched- 
ules with corresponding tandem schedules. Duncan 
and Fantino (1972) compared chain FI FI with FI, 
and chain FR 3 (FI) with chain FR 2 (FI). Unfortunately,
the two studies found strikingly different results.
Schneider (1972) found essentially equal responding
leading to chained and tandem schedules, whereas
Duncan and Fantino (1972) found more responding
leading to the less segmented schedule (FI versus chain
FI FI, and 2-component versus 3-component chained
schedules). Since only a small range of parameters was
explored in terms of schedule type and schedule values,
it would be premature to speculate on which procedural
differences were responsible for the different
results. As with many questions about chained schedules,
further research is necessary to clarify the results.

Surprisingly, in view of the theoretical weight be- 
ing placed on the relative rate data from concurrent 
chains, the constraints in the basic paradigm have
been little explored until recently. For example, in all
seven studies on concurrent chained schedules mentioned
earlier, the schedules for terminating the concurrent
components were equal (identical VI 1-min).

As it turns out, the schedules in the concurrent 
components are important determinants of the rela- 
tive concurrent response rates. Fantino (1969) sched- 
uled food presentations in terminal components ac- 
cording to VI 30-sec and VI 90-sec schedules. Identical 
pairs of schedules were used in the concurrent initial
components. Over the experiment these were VI 40-sec,
VI 120-sec, and VI 600-sec. The mean relative
response frequency on the key associated with the VI 
30-sec terminal component was 0.95, 0.81, and 0.60, 
respectively, as the initial schedules were varied. Thus, 
response distribution in the initial components de- 
pends in part on the absolute value of the equal, con- 
current schedules. Alternative quantitative formula- 
tions of relative response rates in the concurrent 
components as a function of reinforcement variables in 
the terminal components must include some weight- 
ing for the concurrent schedule values (Davison & 
Temple, 1973; Fantino, Chapter 11 in this volume;
Squires & Fantino, 1971). 
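
A short calculation shows why a weighting for the initial schedules is needed. The sketch below assumes the simple matching relation described earlier for Autor's data, in which relative response rate equals the relative frequency of food in the terminal components. The function name is hypothetical, and the observed proportions are the ones quoted above from Fantino (1969).

```python
def relative_reinforcement_rate(vi_a_sec, vi_b_sec):
    """Relative frequency of food on key A for two VI terminal schedules."""
    rate_a = 1.0 / vi_a_sec   # food presentations per second on key A
    rate_b = 1.0 / vi_b_sec
    return rate_a / (rate_a + rate_b)


# Terminal components: VI 30-sec versus VI 90-sec (a 3:1 ratio of food rates).
predicted = relative_reinforcement_rate(30, 90)

# Observed relative response frequencies, keyed by the initial VI value in sec.
observed_by_initial_vi = {40: 0.95, 120: 0.81, 600: 0.60}

for initial_vi, observed in observed_by_initial_vi.items():
    print(f"initial VI {initial_vi}-sec: observed {observed:.2f}, "
          f"simple matching {predicted:.2f}, "
          f"difference {observed - predicted:+.2f}")
```

Simple matching gives a single prediction of three quarters regardless of the initial schedules, whereas the observed proportions fall on both sides of that value as the initial VI lengthens.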

Quantitative models to describe behavior under 
concurrent chained schedules have grown increas- 
ingly complex as experimental variables have been 
extended beyond the initial limited values (see Fan- 
tino, Chapter 11 in this volume). In part this may 
reflect a terminological problem. It is a misnomer to 
call this experimental paradigm concurrent chains. 
Concurrent scheduling occurs for only part of the 
schedule. (For one exception, which was characterized 
by perhaps the most complex quantitative relation- 
ships in this field of research, see Fantino & Duncan, 
1972.) This inaccurate terminology may have also 
distracted attention from what appears to be the crux 
of the matter: the organism is enmeshed in a complex
set of contingencies, where, among other things,
termination of the concurrent components not only is 
followed by one of the terminal components, but also 
delays the presentation of the alternative terminal 
component. Only recently has this point been appre- 
ciated (Duncan & Fantino, 1972). It is naive to con- 
sider concurrent chained schedules and related pro- 
cedures as a simple technique for preference scaling of 
the terminal components. 

In summary, the concurrent chains procedure, un- 
derstood as a name and not a description, is a com- 
plex behavioral situation in which responding in the 
initial components is a function of the absolute and 
relative values of both the initial and terminal com- 
ponents. Much additional research is required to elu- 
cidate the individual sources of control in this situa- 
tion. 


Conditioned Reinforcement in Chained Schedules 


The maintenance of responding under chained 
schedules has often been interpreted in terms of con- 
ditioned reinforcement (Ferster & Skinner, 1957; Gol- 
lub, 1958; Kelleher & Gollub, 1962). The preceding 
review indicates, however, that the strongest stimulus
effects that are demonstrable in chained schedules are
discriminative, i.e., the modulation of responding by 
prevailing stimulation due to previous reinforcement 
history in the presence of these stimuli. If reinforce- 
ment implies enhancement of responding by response- 
dependent stimuli, then only in certain circumstances 
(e.g., when tand FI FI is changed to chain FI FI) does 
response rate increase following response-dependent
presentation of temporally sequential stimuli. The 
difficulty in demonstrating clear reinforcement effects 
of stimulus components of chained schedules has led 
some writers to conclude that these stimuli are not 
reinforcers (e.g., Schuster, 1969). But response en- 
hancement has been reported under certain condi- 
tions: in acquisition following tandem schedules, in 
repeated presentation of stimulus components, in com- 
parisons of chained and equivalent serial stimulus 
(clock) schedules, and in a direct examination of how 
much responding is maintained by response-dependent 
presentations of component stimuli outside the chained 
schedule. Whether responding is enhanced, suppressed, 
or unchanged in a chained schedule compared to be- 
havior under some procedurally related condition 
depends on the specific situations compared. Stimulus 
components of chained schedules may have complex 
behavioral effects that prevent an unambiguous pre- 
diction. 


SCHEDULES OF BRIEF STIMULUS 
PRESENTATION 


Extinction tests of conditioned reinforcement fre- 
quently involved brief presentation of a stimulus that 
had previously been associated with food, such as a 
magazine sound. As discussed above, interpretation of 
these experiments was often ambiguous. Few responses 
were maintained during extinction, at least in part 
because the association between stimulus and food was 
broken, and the reinforcing effectiveness of the stim- 
ulus was itself undergoing extinction. 

More recently, techniques have been developed to 
study the effects on behavior of brief stimuli where 
food presentation continues. These techniques pro- 
vide a chronic situation for the study of brief stimuli 
as maintaining events, with some of the same advantages
over extinction test procedures that the study
of schedules of reinforcement has over the traditional 
extinction tests of partial reinforcement. 

Two major classes of procedure have been developed
in the study of response maintenance with brief stim- 
uli. In one, presentations of the brief stimulus are 
scheduled as part of the schedule that arranges the
presentation of food. These are second-order schedules
(Kelleher, 1966a, b). In another procedure, the brief 
stimulus is presented according to an independent 
schedule that is concurrent with the schedule of food 
presentations (J. Zimmerman, 1963). 


Second-order Schedules of Brief Stimulus Presentation


Kelleher (1966a) defined a second-order schedule in 
this way: “A second-order schedule is one in which the 
behavior specified by a schedule contingency is treated 
as a unitary response that is itself reinforced according
to some schedule of primary reinforcement” (p. 181). 
This comprehensive definition applies to tandem 
schedules (no additional stimuli), chained schedules
(different stimuli during each separate schedule), and 
schedules of brief stimulus presentation, in which a 
brief stimulus is presented according to the unit sched- 
ule contingency. For example, Findley and Brady 
(1965) presented food when a chimpanzee pressed a 
key 4000 times (FR 4000) in the presence of a red 
light. In the presence of a green light, food was also 
presented on the 4000th response; in addition, the 
food hopper was lighted (designated as S) for 0.5 sec
after every 400 responses (FR 400:S). In the terminol-
ogy of second-order schedules, the schedule of food
presentation is denoted as FR 10 (FR 400:S). That is,
food is delivered upon the tenth repetition of the FR
400 schedule. To facilitate description of these sched-
ules, the schedule requirement for stimulus presenta-
tion will be referred to as the unit schedule.³ The
following review will consider the effects of different 
types of unit schedules on response rate and response 
pattern, and will then analyze some of the controlling 
variables. 
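
The notation can be made concrete with a small sketch. The
Python function below is an illustration only, not a description
of the apparatus used by Findley and Brady (1965); the function
name, its parameters, and the event labels are hypothetical. It
counts responses under FR n (FR m:S), records a brief stimulus at
the completion of every unit schedule, and records food when the
nth unit is completed.

```python
def second_order_fr(n_units, unit_size, n_responses):
    """List the scheduled events under FR n_units (FR unit_size:S).

    Hypothetical helper: returns (response_number, event) pairs,
    where event is "S" (brief stimulus) or "food".
    """
    events = []
    units_completed = 0
    for response in range(1, n_responses + 1):
        if response % unit_size == 0:        # a unit schedule is completed
            units_completed += 1
            events.append((response, "S"))   # brief stimulus after every unit
            if units_completed == n_units:   # e.g. the tenth FR 400
                events.append((response, "food"))
                units_completed = 0
    return events


# FR 10 (FR 400:S): ten stimulus presentations and one food delivery
# per 4000 responses.
events = second_order_fr(n_units=10, unit_size=400, n_responses=4000)
stimulus_responses = [r for r, e in events if e == "S"]
food_responses = [r for r, e in events if e == "food"]
```

Setting unit_size equal to the full requirement (n_units = 1)
reduces the sketch to a simple FR schedule in which the stimulus
occurs only together with food.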


OVERVIEW


Stimuli that are paired with food or other reinforc- 
ers can maintain long, orderly sequences of responding 
when they are presented according to second-order 
schedules. The pattern of responding is appropriate 
to the unit schedule, and the overall rate of responding
is frequently higher than that maintained by the
corresponding second-order schedule without a brief
stimulus (tandem schedule). The interpretation of
these results is, however, subject to controversy. In-
creased rate under some circumstances can be attrib-
uted to discriminative effects of the brief stimulus
with respect to food delivery. Other experiments have
examined the relationship of brief stimulus to food
presentation: no single temporal relationship seems
necessary.


³ No standard vocabulary has emerged to describe the dif-
ferent parts of second-order schedules. It has been suggested
that the schedule according to which the brief stimulus is
presented (FR 400 in the present example) be called the
subordinate schedule, and the schedule for arranging food
(FR 10), the superordinate. The priority implied in such a
nomenclature may not always apply, especially when schedules
of the type FR (FI) are studied. In that case, the superordinate
schedule would bear no important relationship to the ensuing
behavior, which more closely resembles that under FI. Similarly,
some have suggested food schedule, instead of superordinate.
Again, it is the entire schedule, FR (FI), that arranges food
presentation.


PATTERNS OF RESPONDING UNDER 
SECOND-ORDER SCHEDULES OF 
BRIEF-STIMULUS PRESENTATION 


Fixed- and Variable-interval Unit Schedules. Long,
orderly sequences of behavior can be maintained un-
der second-order schedules of brief-stimulus presenta-
tion with FI unit schedules. Kelleher (1966b) scheduled
food presentation to pigeons under FR 30 (FI 2-min:S)
and FR 15 (FI 4-min:S), where S was a change in key
color from blue to white for 0.7 sec. Under both sched-
ules, the minimum time between food presentations
was 60 min. For comparison, behavior was also stud-
ied under the corresponding tandem schedules. Three
aspects of the results are important. First, under all
four schedules (15 and 30 unit schedules, with and
without brief stimulus presentations) behavior was
well maintained over the one-hour periods separating
food presentations. This is in marked contrast to the
effects of chained schedules, in which substantial
pauses usually occur when even 5 components of FI
1-min are scheduled (cf. Squires, Norborg, & Fantino,
1975). Second, the average rate of responding was low
for the first few unit schedule intervals, and then in-
creased. This is illustrated by the cumulative records
shown in Figure 5. Third, the brief stimulus con-
trolled the temporal pattern of responding. There
was usually a pause after each brief stimulus, fol-
lowed by either a gradual or an abrupt acceleration to
a moderately high rate until the next stimulus presen-
tation. The maximal rate in each unit of the sched-
ule sequences with the brief stimulus was generally
higher than the maximal rate reached under the
schedule without brief stimuli (11 out of 12 com-
parisons). This difference is illustrated in Figure 6,
which shows mean response rates in successive quar-
ters of the unit schedule interval under schedules
with (circles) and without (triangles) brief stimulus
presentations. Figure 6 also shows that the overall
rate of responding, i.e., the mean number of re-
sponses per food presentation, was generally greater
under the schedule of brief stimulus presentation than
under the comparable tandem schedule. Boren (1973)
found similar patterns of responding when pecks on a
matching-to-sample procedure produced food under
FR (FI:S) schedules.

Fig. 5. Effects of omitting presentations of the white light at
the end of each FI 4-min component (Bird 128). Each cumulative
response record shows a sequence of 15 consecutive FI 4-min
components; each sequence terminated with food reinforce-
ment. Short diagonal strokes on the records designated FR 15
(FI 4-min:W) indicate 0.7-sec presentations of white light. Under
the FR 15 (FI 4-min) schedule, there were no exteroceptive
stimulus changes during the sequence; short diagonal strokes
indicate the end of each FI 4-min component. (From Kelleher,
1966b. © 1966 by the Society for the Experimental Analysis of
Behavior, Inc.)

Fig. 6. Effects of omitting presentations of the white light on
mean rates of responding in each quarter of the fixed-interval
components (Bird 128). Each point is the median of the mean
rates of responding in the last five sessions under each pro-
cedure. Solid circles: white light presented at termination of
each fixed-interval component; triangles: no exteroceptive
stimulus change at termination of fixed-interval components;
open circles: redetermination of effects of presenting white light.
(From Kelleher, 1966b. © 1966 by the Society for the Experi-
mental Analysis of Behavior, Inc.)
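
The equal spacing of food under Kelleher's two schedules follows
from simple arithmetic, sketched below in Python. The helper name
is hypothetical, and the calculation assumes that the response
completing each fixed interval occurs the moment the interval
elapses, so the product is a minimum and not an obtained value.

```python
def min_minutes_between_food(n_units, fi_minutes):
    """Minimum time between food presentations under FR n_units (FI fi_minutes:S)."""
    return n_units * fi_minutes


# The two second-order schedules described in the text.
schedules = {"FR 30 (FI 2-min:S)": (30, 2), "FR 15 (FI 4-min:S)": (15, 4)}
minimum_times = {
    name: min_minutes_between_food(n_units, fi_minutes)
    for name, (n_units, fi_minutes) in schedules.items()
}
```

Both entries come to 60 min, which is why the two schedules
differ in the number of brief stimuli per food presentation but
not in the minimum interval between food presentations.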

Reinforcement omission, or percentage reinforce- 
ment of FI schedules, also involves second-order FI 
schedules. In this paradigm, responding is stabilized 
under a FI schedule, and food delivery is then omitted 
at the end of some percentage of the intervals. A time 
out (a period in which all lights in the experimental 
chamber are turned off) or other brief stimulus oc- 
curs instead. The sequence of intervals ending with 
and without food is irregular. The total schedule com-
prises a VR (FI:S), in that all intervals end in a brief
stimulus (S) and after a variable number of these
intervals, food also follows.⁴ Staddon and Innis (1969)
found that responding began much sooner in those 
2-min fixed intervals which were preceded by time out 
than in those preceded by food. Mean response rates 
were therefore higher under the second-order schedule 
than under fixed-interval. The pause at the beginning 
of the interval varied with the duration of time out, 
which ranged from 2 to 32 seconds. Similar effects of 
omitted food delivery were demonstrated by Staddon 
(1970, 1972, 1974) and Kello (1972). 

In a related experiment, Zeiler (1972) found that 
the smaller the percentage of intervals ending with 
food, the greater the responding earlier in the inter- 
val. As in a number of other experiments with this 
paradigm, Zeiler (1972) found that time out was an 
important factor in obtaining temporal control 
through the interval. In a related experiment, Zeiler 
(1972) studied the effects of presenting a 10-sec black- 
out after every interval in addition to either a 4-sec 
presentation of food or 4-sec time out that substituted 
for food. He found that responding was approx- 
imately equal in all intervals, whether they ended in 
food or not. 

Response rate increases in the reinforcement omis- 
sion experiments are not necessarily related directly to 
the occurrence of higher rates with schedules of brief 
stimulus presentation. In the former experiments,
each food presentation typically follows between 2
and 4 unit schedule sequences, compared to as many
as 64 in experiments on second-order schedules. Also,
response rates in these experiments are generally com-
pared to rates under a simple FI, or involve compari-
sons of performance following food with those follow-
ing time out, whereas in Kelleher (1966a, b) and most
other experiments, comparisons are made to tandem
schedules. The two research paradigms thus cover
different values of schedule, and make comparisons
with different baselines. The research on reinforce-
ment omission has, however, contributed to the analy-
sis of interpretation of the behavioral function of the
brief stimulus, to be discussed later.


⁴ To be precise, this is a second-order schedule with unpaired
stimulus, since time out does not precede food when the latter
is presented. Some differences between paired and unpaired
brief stimuli, and between blackout and other events as the
brief stimulus, will be discussed later.

Short, response-initiated fixed-interval unit sched- 
ules were studied by Neuringer and Chung (1967), 
Chung and Neuringer (1967), and Neuringer (1968). 
Chung and Neuringer (1967) studied a procedure in 
which a key peck started a fixed interval of 1 sec to 
30 sec duration. The first peck after the interval
elapsed produced either a 1-sec blackout (S) or, after
variable intervals averaging one minute, food. The
schedule can thus be notated VI 1-min (tand FR 1 FI
X:S), with S unpaired with food. The pause before
the first peck after blackout increased linearly with
interval duration; the mean rate of responding during
the fixed-interval period decreased as a negatively ac-
celerated function of interval duration (cf. Starr &
Staddon, 1974). In some parts of the experiment, food
was scheduled after every unit schedule, a FR 1 (tand
FR 1 FI X). Response rates under different values of
the fixed interval were similar to those with VI 1-min
in the second-order schedule, and the pause in these 
intervals was lower than when VI was scheduled. 
Chung and Neuringer (1967) reported similar rates 
for tand FR 1 FI schedules as for fixed ratio (FR 11). 
Thus, the contingencies of the short response-initiated 
interval schedules may make them more comparable 
to ratio than interval schedules. 
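
The response-initiated unit schedule, tand FR 1 FI X, can be
sketched as follows. This Python fragment is a hypothetical
illustration and not the procedure as programmed by Chung and
Neuringer (1967): the function name and the peck times are
invented for the example, and a peck that falls exactly at the
end of the interval is assumed to complete the unit.

```python
def unit_completions(peck_times, interval):
    """Times at which a tand FR 1 FI `interval` unit schedule is completed.

    peck_times: ascending peck times in seconds (illustrative values).
    """
    completions = []
    start = None                      # time of the peck that started the interval
    for t in peck_times:
        if start is None:
            start = t                 # FR 1: this peck initiates the fixed interval
        elif t - start >= interval:
            completions.append(t)     # first peck after the interval has elapsed
            start = None              # the next peck initiates a new interval
    return completions


# Invented peck times with X = 5 sec: intervals start at 0 and 7 sec.
completed_at = unit_completions([0, 1, 2, 6, 7, 8, 13], interval=5)
```

Because the interval does not begin until a peck occurs, pausing
postpones the next stimulus or food presentation, a property that
such schedules share with ratio schedules.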


Fixed-ratio Unit Schedules. Second-order schedules 
with fixed-ratio unit schedules of the form FR (FR:S) 
offer special circumstances for examining the behav- 
ioral properties of the brief stimulus. In such sched- 
ules, food is presented after a fixed number of re- 
sponses, and performance can thus be compared to 
FR schedules with the same total response require- 
ment. The added brief stimulus does not contribute 
additional response or temporal requirements, as with 
fixed-interval unit schedules, or with chained sched- 
ules. 

The facilitatory effects of an added brief stimulus
under FR 10 (FR 400:S) in the chimpanzee and
monkey were described earlier (Findley & Brady, 1965). 
Thomas and Stubbs (1967) found a similar effect with 
pigeons responding under FR 5 (FR 30:S) compared 
with FR 150. 

Lee and Gollub (1971) studied second-order FR 
schedules of the form FR X (FR Y:S), with X·Y =
256. Values of X equal to 1, 2, 4, 8, 32, and 128 were
presented in both ascending and descending orders. 
The brief stimulus was a 0.5-sec change in key light 
from red to green. Although 256 responses were re- 
quired for food presentation under all conditions, the 
size of the unit schedule (or, conversely, the number 
of unit schedules and light flashes per food presenta- 
tion) had dramatic effects on responding, as shown in 
Figure 7. The FR 256 schedule is represented by the 
extreme left point of each graph. Note that the high- 
est median overall rates were obtained when the unit 
schedule was FR 64 or FR 128 (corresponding to 4 or 
2 unit schedules per food presentation). Under the 
FR 2 unit schedule, responding under the second-order
schedule occurred at a rate lower than or equal to that
under FR 256.
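
The relation between the number of unit schedules and the size
of each unit, with the product X·Y fixed at 256, can be tabulated
with a few lines of Python. The helper and variable names are
hypothetical; the values of X are those listed in the text.

```python
TOTAL_RESPONSES = 256  # responses required per food presentation


def unit_size(n_units, total=TOTAL_RESPONSES):
    """Size Y of the FR unit schedule when X = n_units units make up the total."""
    if total % n_units != 0:
        raise ValueError("n_units must divide the total response requirement")
    return total // n_units


# X (unit schedules per food presentation) -> Y (responses per unit schedule)
unit_sizes = {x: unit_size(x) for x in (1, 2, 4, 8, 32, 128)}
```

Here X = 1 is the simple FR 256, X = 4 and X = 2 give the FR 64
and FR 128 unit schedules, and X = 128 gives the FR 2 unit
schedule.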

This parametric study indicates that global gen-
eralizations based on comparisons between a single
second-order schedule of brief stimulus presentation
and a corresponding FR schedule without a brief
stimulus may be unwarranted, since performance de-
pends on the specific scheduling of the brief stimulus.
Parametric variation of schedule values is clearly
necessary to determine functional properties in this
complex situation.

Fig. 7. Median overall response rates under each condition for
Birds 82 and 80. Each pigeon was studied under FR X (FR Y:S)
second-order schedules of brief stimulus presentation. A total
of 256 responses was required for each food presentation such
that X·Y = 256. The values shown are medians of the median
rate in the last five sessions. Median rates within each session
were calculated from a printed record of the time elapsing be-
tween food reinforcements. Arrows indicate the order in which
the conditions were varied. (From Lee & Gollub, 1971. © 1971 by
the Society for the Experimental Analysis of Behavior, Inc.)

The effects of different FR unit sizes were also in-
vestigated by Shull, Guilkey, and Witty (1972). They
studied FI (FR) schedules in which 10 or 20 key pecks
turned off the key light for 0.7 sec; pigeons were
studied under FI 3-min, FI 6-min, and FI 12-min
schedules, and corresponding second-order schedules. 
The FR unit schedules generated the pause-run re- 
sponse pattern typical of FR schedules of food presen- 
tation. Although Shull et al. (1972) did not present 
quantitative comparisons of rate, cumulative response 
records showed more responses per interval under FR 
10 unit schedules than under simple FI, and more still 
under FR 20. Remarkably, despite these changes in 
rate and temporal organization of responding, the 
percent of the interval to the first response was not 
affected by interval duration, and increased only 
slightly as the behavior required for food presentation 
increased from 1 peck to 20 pecks. This result shows 
the strong temporal control of some features of re- 
sponding by periodic food presentation, as well as 
control by the unit schedule. 

Detailed measurement of responding within FR 
unit schedules shows the same pattern of interre- 
sponse times (IRT) within the unit schedule as that 
shown when food is presented under FR schedules (cf. 
Gott & Weiss, 1972). Kelleher (1966a) studied pecking 
in pigeons under a FI 10-min (FR 20:S) schedule. The 
time to the first peck in each group of 20 pecks was 
generally longer than other IRTs, and approximately 
equal IRTs characterized the next 19 pecks. Davison
(1969), in an experiment with rats, only partly con-
firmed these results. Many differences in procedure
between these studies make a detailed comparison
impossible.


THE ANALYSIS OF THE BEHAVIORAL
FUNCTIONS OF BRIEF STIMULI 


Two alternative behavioral functions have been 
proposed for the brief stimulus, reinforcing and dis- 
criminative. Compared to responding under identical 
schedules of reinforcement without brief stimulus 
presentation, increases in overall or local rates of a
response that produces the brief stimulus, as in Kelle-
her (1966a, b), Findley and Brady (1965), and Lee and
Gollub (1971), directly implicate a reinforcing func-
tion.

The brief stimulus also controls the temporal pattern
of responding. Unit schedules that terminate in
a brief stimulus engender patterns similar to equiv- 
alent schedules of food presentation. The brief stim- 
ulus thus serves a discriminative function by control- 
ling responding subsequent to its presentation. 

Experimental analysis of the discriminative and 
reinforcing functions of brief stimuli has been at- 
tempted with several different techniques. The rein- 
forcing effect has been examined with the concurrent 
chains procedure. Schuster (1969) scheduled as the 
terminal components of concurrent chained schedules, 
VI (FR 11:S) and VI, arguing that if S served a rein- 
forcing function, its presentation in only one terminal 
component should enhance the total reinforcing 
effectiveness of that component. Instead, the pigeons 
had slightly higher rates in the initial component that 
did not produce the second-order schedule. This re-
verse preference may result from the fact that dis-
parate response rates were generated in the terminal
components (cf. Gollub, 1970; Moore & Fantino, 1975). 

Another type of analysis of the reinforcing func- 
tion of brief stimuli is related to the process by which 
conditioned reinforcers are established. Kelleher and 
Gollub (1962) argued that a conditioned reinforcer 
was established by simple contiguity of a stimulus 
with an effective reinforcer. (Alternatively, Keller & 
Schoenfeld, 1950, and others, have maintained that 
the conditioned reinforcer must be a discriminative 
stimulus for an operant, and still others have indi- 
cated an informational function, discussed at greater 
length in Chapter 11.) The pairing hypothesis, and 
to a lesser extent the discriminative stimulus hypoth- 
esis, have been tested in a series of studies that 
examined the effect of pairing the brief stimulus with 
food. 


Pairing the Brief Stimulus with Primary Rein- 
forcers. Kelleher (1966b) compared behavior under a 
second-order schedule of a brief stimulus paired with 
food with behavior under a second-order schedule of 
a stimulus not paired with food. In the former case, a 
white light was presented for 0.7 sec under a FR 15 
(FI 4-min:S). Comparison conditions consisted of 
scheduling for 0.7 sec either an unlit key (D) or red 
key light (R) 14 times, with food alone presented after 
the fifteenth interval. In this case S was an unpaired
stimulus.⁵ The results for one of the three pigeons are
shown in Figure 8. Mean response rates in each quar-
ter of the FI in three replications with white key light
paired with food are shown by circles. The rates with
the other (unpaired) dark key are shown in the left
panel, and those with (unpaired) red key, in the right
panel. Both overall rates and rate changes within the
FI period were different under the three conditions.
Highest overall and terminal rates were obtained
when the light that was paired with food was pre-
sented. Lowest mean rates, with little temporal con-
trol by the FI unit schedule, were obtained when a
dark key was scheduled. The effects of the red light
were intermediate: moderate mean rate, with some
temporal control, although terminal rates in the inter-
val were not as high as with the paired stimulus.
One of three pigeons in the experiment responded al-
most as much with red as with white brief stimuli.
Although Kelleher (1966b) interpreted these results as
showing “that it may be necessary to present a
stimulus in temporal contiguity with a reinforcing
stimulus if the former stimulus is to become an effec-
tive conditioned reinforcer” (p. 484), he also indi-
cated that the specific stimulus used and its previous
associations with food, as well as the amount of train-
ing with an unpaired stimulus (here, 17 and 7 sessions,
respectively) might be important parameters of the
effect.


⁵ It should be noted that the precise notation for the
schedule of presentation of the unpaired stimulus is tand FR 14
(FI 4-min:S) FI 4-min. This notation emphasizes the existence
of two differences between schedules of paired and unpaired
stimuli: not only is the unpaired stimulus never followed directly
by food, but food is delivered immediately after a response
(FI 4-min), versus after a brief delay with a paired stimulus. In
some experiments, e.g., Kelleher (1966b), food presentation under
unpaired conditions is delayed by the duration of the brief
stimulus, to match the key peck-food relation of the paired
schedule. Another procedure, which would have certain formal
advantages over those described previously, would be denoted
tand FR 14 (FI 4-min:S1) FI 4-min:S2, where S1 and S2 refer to
two different brief stimuli. That is, every interval would termi-
nate with a brief stimulus, but the stimulus presented under
the early components of the schedule would be different from
the stimulus paired with food.

Fig. 8. Effects of presenting a stimulus that was not paired with
primary reinforcement on mean rates of responding in each
quarter of the fixed interval (Bird 149). Solid circles: white light
terminated each component; triangles: dark key (left panel)
or red light (right panel); open circles: final determination of
points when white light terminated each component. (From
Kelleher, 1966b. © 1966 by the Society for the Experimental
Analysis of Behavior, Inc.)


Subsequent experiments have confirmed some
aspects of these results but have been equivocal about
others. Over 20 studies have been concerned with the
pairing operation of brief stimuli in second-order
schedules. While the majority of comparisons have
shown that a stimulus paired with food produced
higher rates, or more pronounced control of rate
changes by the unit schedule, or both, many of these
studies have suffered from a serious methodological
flaw: the stimulus paired with food was a physically
different event from the unpaired stimulus. In some
studies the difference was small (e.g., in Kelleher,
1966b, white key light vs. red key light) and the effect
could reasonably be attributed to the pairing opera-
tion. In other studies the difference was considerable
(presentation of a light in the feeder vs. a light on the
key, de Lorge, 1969; Stubbs, 1969) and differences in
effect could be due to any number of differences in the
stimuli such as modality, location, intensity, previous
exposures, and so on.

Three experiments reporting differences between 
paired and unpaired stimuli used the same stimulus 
in each condition, presented in different phases of the 
experiment. Hughes (1973) scheduled changes in key 
color and houselight for pigeons; D. W. Zimmerman
(1969) scheduled a tone and a light for rats under con-
joint schedules; Byrd (1972) scheduled a light for
squirrel monkeys under a second-order schedule of
electric-shock presentation. In four experiments in which
similar and presumably equivalent stimuli were pre-
sented, either paired or unpaired with food, higher
rates were obtained with paired stimuli (de Lorge,
1967, 1969, 1971; Kelleher, 1966b).

Six studies reported equivalent effects using the 
same stimulus, both paired and unpaired, though of 
necessity the conditions were studied in separate 
phases of the experiment (Kelleher, 1966b, who found 
equivalent effects in one of three pigeons; Cohen & 
Stubbs, 1976; Stubbs, 1971, who reported 6 experi- 
ments; Hughes, 1973, with bright houselight as S; 
Stubbs & Cohen, 1972; Stubbs & Silverman, 1972, who 
used electric shock as the brief stimulus).

What accounts for these conflicting results? A com-
parison of schedule types and schedule values, unit
schedules, deprivation conditions, food presentation
parameters, prevailing stimuli and briefly presented
stimuli shows extensive overlap among these studies
(cf. Stubbs, 1971). No single variable distinguishes 
studies in which the pairing operation produced 
differential effects from those in which it did not. 
However, it is possible that a combination of some of 
these variables may be responsible for the differences. 


Physical Properties of the Briefly Presented Stimu- 
lus. Stubbs (1971) demonstrated that the greater the 
number and type of events used as the brief stimulus, 
the greater the effect of the brief stimulus on temporal 
pattern of responding in FI unit schedules. He pre- 
sented either a red key light, a white houselight, both, 
or a blackout, under nonpaired conditions. Greatest 
temporal rate changes were found with key light and 
houselight changes together, and least with blackout 
alone. Hughes (1973) found that a paired stimulus 
gave greater temporal rate changes (scallops) with FI 
unit schedules than did an unpaired stimulus, but 
found equivalent performances with a more intense 
stimulus. An electric shock to the pubis of pigeons 
likewise had identical effects whether paired with food 
or unpaired (Stubbs & Silverman, 1972). Similarly, 
Kello (1972) found greater control of pausing after 
omitted food presentations when a food magazine 
light accompanied blackout than with blackout alone. 

The use of blackout as the brief stimulus has 
yielded varied results. In most experiments, blackout 
has been used as an unpaired stimulus, in the sense 
that food was presented immediately after the ter- 
minal peck under the second-order schedule. It can be 
argued, however, that a blackout is scheduled like a 
simultaneously paired stimulus, in the sense that key 
light and houselight are generally turned off during 
food presentation. This ambiguity in classification is 
possibly related to the inconsistent effects. Kelleher 
(1966b) found little patterning induced by a 0.7-sec
change from blue key to dark key in an otherwise 
unlit chamber. Stubbs and Cohen (1972) found that a 
2-sec blackout that was scheduled as a paired stimulus 
produced less patterning in FI 48-sec unit schedules 
than key light and houselight changes that were either 
paired or unpaired. On the other hand, Neuringer 
and Chung (1967) got effective control of responding 
with 0.25 to 7-sec blackouts under FR 11 or brief, 
response-initiated FI schedules. The effects of black- 
outs as brief stimuli under second-order schedules may 
thus depend on the unit schedule that is studied (cf. 
Starr & Staddon, 1974). 

A second parameter that determines the effect of a 
brief stimulus is its duration. Byrd (1972) studied
second-order schedules in squirrel monkeys in which
the terminal maintaining event was a brief electric 
shock (cf. Morse & Kelleher, 1970). A blue light was 
scheduled under FR 4 (FI 4-min:S) for 0.1 to 10 sec. 
As duration of the brief stimulus increased, mean 
response rate decreased, with lowest rates early in the 
interval and least variability at the 1-sec duration.
Cohen, Hughes, and Stubbs (1973) varied stimulus 
duration from 0.5 to 8 sec on VI 240-sec (FI 48-sec:S). 
There was relatively less responding early in the inter- 
vals under longer durations, and a clear trend of de- 
creasing rate with increasing duration of the stimulus. 
A third effect requires special consideration. 

Cohen, Hughes, and Stubbs (1973) found that in 
both pigeons responding was more positively accel- 
erated within FI units during the second exposure to 
durations of 0.15 and 2 sec than during the first. This
implies the existence of a partial irreversibility due
either to extensive exposure to brief stimulus proce-
dures or prior exposure to a long value (8 sec). Effects
in other studies have been reversible. Hughes (1973)
found pairing effects even when the order of presenta-
tion of paired and nonpaired conditions was con-
trolled. Thus, history alone may not determine the
effects of pairing, but history in combination with
some other variables may be crucial (cf. Marr & Zeiler,
1974).


Discriminative Effects of the Brief Stimulus. Under 
second-order schedules with FI and FR unit schedules, 
the brief stimulus can be expected to serve a discrim- 
inative function: responses immediately after it are 
never reinforced. The stimulus thus controls a low 
or zero rate after its occurrence, as does food in a FI or 
FR schedule with primary reinforcement (cf. Staddon, 
1974). Paired and unpaired stimuli are equivalent 
with respect to nonreinforcement of immediately sub- 
sequent responses. 

In the balance of response-dependent rate-enhanc- 
ing effects and response-independent rate-decreasing 
effects of brief stimuli, the latter are often predomi- 
nant in the stable state. The former are often stronger 
either early in training, or when the stimuli are less 
intense and therefore less effective as discriminative 
stimuli. In fact, all the variables that are relevant to 
the development of stimulus control (stimulus salience 
as determined by modality, intensity, and previous 
training conditions) would be expected to affect the 
control by scheduled brief stimulus presentations (cf. 
Starr & Staddon, 1974). 

Two recent experiments have shown changes in 
control by brief stimuli with repeated training. Marr 
and Zeiler (1974) first scheduled a brief stimulus un-
paired with food under four procedures. Its presenta-
tions had minimal effect. After six procedures in
which the stimulus was paired with food presentation, 
it was again scheduled unpaired with food. It now 
controlled responding nearly as well as when it was 
paired. Thomas and Blackman (1974) compared VI 
66-sec (FI 10-sec:S) and VI 66-sec (VI 10-sec:S), where 
S was a change in key light from white to red for 3 
sec, with comparable tandem schedules. Response 
rates were higher under both schedules of brief stim- 
ulus presentation, but pauses after the stimulus oc- 
curred only with the FI unit schedule. This result is 
consistent with the fact that food could occur at any 
time after the stimulus under the VI, whereas it could 
not occur for 10 sec under the FI. 

The critical analysis of the behavioral function of 
the brief stimulus resembles the inconclusive debates 
that characterized the analysis of conditioned rein- 
forcement in extinction procedures. Although dis- 
criminative effects were demonstrated in old response 
procedures, there has been no agreement on whether 
response acquisition on new response procedures can 
be ascribed unambiguously to conditioned reinforce-
ment (Wike, 1969, but see Schuster, 1969; Longstreth,
1971).

Second-order schedules of brief stimulus presenta- 
tion were developed, among other reasons, “to deter- 
mine whether a brief stimulus that was occasionally 
contiguous with food delivery would maintain re-
sponding” (Kelleher, 1966b). Such a possibility would
permit chronic investigation of conditioned reinforce- 
ment, since food presentation is continued in these 
schedules. The possibility that at least some rate in- 
creases under second-order schedules arise from dis- 
criminative effects of the brief stimulus (Fantino, 
Chapter 11 of this volume) also challenges an explana- 
tion in terms of reinforcement. Thomas and Black- 
man (1974) concluded that it was “difficult to see how 
the reinforcement hypothesis is to be differentiated 
from the discriminative hypothesis in terms of the 
performances maintained by brief stimuli in second- 
order schedules. Both hypotheses suggest that the way 
in which brief stimuli are scheduled may be crucial 
and both hypotheses are consistent with data” (p. 
105). 


BRIEF STIMULI SCHEDULED 
DURING EXTINCTION 


One of the classic methods for testing the effective- 
ness of a stimulus as a conditioned reinforcer is to 
make its presentation dependent on a response in the 
absence of food (an extinction test). Because the stim- 
ulus was no longer paired with food, its effects even- 
tually disappeared, although intermittent scheduling 
during the training phase could extend its effectiveness 
(D. W. Zimmerman, 1959). Both Zimmerman (1959) 
and Kelleher (1961) showed that the schedule under 
which the brief stimulus was produced, with food 
completely omitted, determined rate and temporal 
pattern of responding. 

Several additional experiments have extended this 
paradigm. Thomas (1969) prolonged the effect of a 
briefly presented stimulus complex associated with 
food delivery. He scheduled brief presentations in the 
presence of one key color and longer presentations in 
the presence of a second color. The two colors alter- 
nated during sessions. For example, when a triangle 
was projected on the key, food was presented for 4 
sec under FR 120. When the key was green, food was 
presented for 0.3 sec, a duration too short to allow eat- 
ing. Substantial amounts of responding occurred when 
the key was green, with pauses after each presentation 
that were shorter than the pauses when pecks produced 
accessible food. When responses during green had no 
effect (EXT) or when they produced a 0.3 sec change 
in key color to red, a stimulus that was not associated 
with food presentation, very little responding was sus- 
tained. When 4 sec of food access was scheduled only 
when no pecks had occurred for 20 sec (DRO 20-sec) 
and brief (0.3 sec) operations of the feeder occurred 
on every tenth peck (FR 10), the pigeon responded 
faster when not receiving food than when it did. 
Thomas and Johansen (1970) obtained similar effects 
with a key-light change as the brief stimulus, rather 
than stimuli intimately associated with food. Thus, in 
both these experiments, responding was maintained 
by presentation of stimuli paired with food in the 
presence of a key color that was never directly asso- 
ciated with food (cf. Herrnstein & Loveland, 1972). 


Token Reinforcement. A token is a small physical 
object that is delivered to an organism under some 
schedule. In the presence of another stimulus, a 
specified response involving the token, such as insert- 
ing it into a receptacle, is followed by presentation of 
food, juice, etc. This procedure resembles both 
chained schedules and second-order schedules of brief- 
stimulus presentation. The schedule of token delivery 
is the unit schedule, and the schedule for exchange 
specifies when behavior under the unit schedule is 
followed by food. The accumulating number of 
tokens is also a stimulus that is correlated with the 
possibility of food presentation, especially when food 
is scheduled after a given number of tokens. The 
number of accumulated tokens could have effects 
similar to those of chained schedule (or clock and 
counter) component stimuli. 

In earlier experiments (Kelleher, 1957a,b, 1958) 
chimpanzees received poker chips which later were 
exchanged for foods and liquids. Similar procedures 
have recently been developed for use with rats (Mala- 
godi, 1967a,b; Waddell, Leander, Webbe, & Mala- 
godi, 1972). As in second-order brief-stimulus sched- 
ules, the schedule of token delivery engenders 
consistent patterns of responding. Substantial rates are 
sustained at times considerably removed from food 
presentation. As in chained schedules, however, ex- 
tended pauses occur when the stimulus conditions 
(i.e., number of tokens) are those correlated with zero 
probability of food presentation. Responding can be 
instated almost immediately by presenting, indepen- 
dently of responding, a large number of tokens 
(Kelleher, 1958). The stimulus conditions then re- 
semble those typically prevailing closer to the end of 
the schedule sequence. 

Malagodi (1967a,b) developed token reinforce- 
ment procedures with rats. Bar presses produced glass 
marbles under FR or VI schedules. According to an 
exchange schedule, in the presence of a characteristic 
stimulus, insertion of each marble into a receptacle 
was followed by delivery of a food pellet. Moderately 
low rates were maintained when food was available 
after 2 or 10 marbles had been delivered, a FR (VI) 
schedule. When a marble was produced under FR 20, 
a brief pause followed by high rates of bar pressing 
followed each marble delivery. Waddell et al. (1972) 
also scheduled marble delivery under a FR 20 unit 
schedule; the opportunity for food-reinforced marble 
insertion followed a marble delivery at a fixed time 
after the last food delivery, under a FI (FR) schedule. 
Even though every marble had always been exchanged 
for one food pellet, as the FI duration increased, the 
response rate, and therefore the number of marbles 
delivered, decreased. Cumulative records of respond- 
ing under FI values of 1.5 min, 4.5 min, and 9.0 min 
show high local rates alternating with pauses after 
each marble delivery or food presentation, with longer 
pauses typically following the latter. Overall response 
rate between food presentations had an increasing 
trend, so that responding showed effects of both FI 
and FR schedules. A direct comparison of correspond- 
ing chained schedules, schedules of brief stimulus 
presentation, schedules of token delivery, and tandem 
schedules should be informative. 
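
As a minimal sketch of how a token schedule composes a unit schedule with an exchange schedule, the following Python function counts tokens and food pellets for a fixed number of bar presses. It is an illustration only, not a model of any of the experiments above: the function name, the choice of a fixed-ratio exchange requirement, and the parameter values (FR 20 unit, exchange after 2 tokens) are assumptions made for the example, and responding is treated as a simple count of presses with no pausing.

```python
def token_session(n_presses, unit_fr, tokens_per_exchange):
    """Count pellets earned and tokens still held after n_presses.

    Hypothetical simplification: every unit_fr-th press delivers one token
    (the unit schedule); once tokens_per_exchange tokens are held, they are
    all exchanged, one pellet per token (the exchange schedule).
    """
    tokens_held = 0
    pellets = 0
    presses_since_token = 0
    for _ in range(n_presses):
        presses_since_token += 1
        if presses_since_token == unit_fr:
            # Unit schedule satisfied: deliver one token.
            presses_since_token = 0
            tokens_held += 1
            if tokens_held == tokens_per_exchange:
                # Exchange schedule satisfied: each token yields one pellet.
                pellets += tokens_held
                tokens_held = 0
    return pellets, tokens_held


# 100 presses, FR 20 for each token, exchange after every 2 tokens.
print(token_session(100, 20, 2))  # (4, 1)
```

The number of tokens held at any moment is the quantity that, in the account above, could function like a counter stimulus correlated with proximity to food.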


CONCURRENT AND CONJOINT SCHEDULES 
OF BRIEF STIMULUS PRESENTATION 


The analysis of the role of the added stimulus in 
second-order schedules of brief stimulus presentation 
is complicated because a fixed relationship of the 
stimulus to food presentation may establish discrim- 
inative control. Reinforcing effects of a stimulus 
paired with food may be revealed less ambiguously in 
concurrent and conjoint schedules of brief stimulus 
and food presentation in which the temporal relation 
of the stimulus to food is irregular, since the two 
events are scheduled independently. When the sched- 
ules are associated with different operanda, they will 
be called concurrent schedules, and when they are 
associated with a single operandum, conjoint (Catania, 
1968). 

Sustained responding by brief stimulus presenta- 
tions that are scheduled concurrently with food has 
been demonstrated under several conditions. In one 
group of experiments, by J. Zimmerman and his col- 
leagues, pecking on one key produced for 0.5 sec the 
same stimulus complex that ordinarily accompanied 
food delivery: the key became dark, the houselight 
was turned off, a light in the feeder compartment was 
turned on, and a solenoid operated and raised a tray 
of food. A mechanical shutter covered the feeder open- 
ing in later experiments, preventing access to food. 
Accessible food, presented for 3 to 4 sec, was either 
scheduled concurrently, for pecks on a second key, or 
conjointly, under a variable-time (VT) schedule. In 
most of the experiments pecks on the key that pro- 
duced the brief stimulus postponed the delivery of 
accessible food for 6 sec. This delay presumably 
attenuated the direct effects of food on pecking. 

A series of five publications showed that rates of 
pecking between 3 and 10 per min were maintained by the 
0.5 sec magazine stimulus complex. J. Zimmerman 
and Hanford (1966) found considerably higher rates 
in the presence of a blue key light where pecks pro- 
duced the food-paired stimulus complex on FI 1-min 
than in a yellow key light, where pecks had no conse- 
quences. Accessible food was presented independently 
of pecking under a VT 3-min schedule. J. Zimmer- 
man, Hanford, and Brown (1967) found that the rate 
of pecking increased as the frequency of pecks pro- 
ducing the food-paired stimulus increased. 

The food-paired stimulus also sustains responding 
for considerable periods of time after its pairing with 
food is eliminated. J. Zimmerman (1969) found that 
pecking was maintained in two pigeons by presenta- 
tion of the brief stimulus for 24 and 32 50-min ses- 
sions. J. Zimmerman and Hanford (1967) found 
similar maintenance for as long as 16 sessions. 
Together with Thomas’s (1969) and Kelleher’s (1961) 
demonstrations of extensive maintenance of respond- 
ing by food magazine stimuli, these results indicate 
that stimuli with extensive histories of contiguous 
association with food gain powerful maintaining con- 
trol when presented as behavioral consequences. 

In addition to the parametric investigation of fre- 
quency of presentation of brief stimuli, the effects of 
different schedules on temporal pattern of responding 
have been investigated. The schedules used include 
FI, VI, FR, DRL, and extinction. In general, the 
temporal pattern under each schedule resembled that 
typically generated by the same schedule of accessible 
food, although at considerably lower rates. Thus, 
schedules of response-produced stimuli paired with 
food appear to engender responding in a similar fash- 
ion to that maintained by traditional positive rein- 
forcers, such as food and water. 

Response maintenance under concurrent and con- 
joint schedules of brief stimulus presentation could 
be due to a direct effect of delayed presentation of 
accessible food. Two types of results argue against this 
possibility, however. First, multiple schedules of brief 
stimulus presentation control appropriate response 
rates. If pecks in the presence of one key color do not 
produce the brief stimulus while pecks in a different 
key color do (J. Zimmerman, 1963; J. Zimmerman & 
Hanford, 1966), the latter stimulus controls higher 
rates, even though responses in the presence of both 
colors are followed by delayed food. Moreover, this 
effect readily reverses when the correlation of schedule 
and stimulus is reversed (J. Zimmerman & Hanford, 
1967; Hamm & Zimmerman, 1972). 

Second, the food-paired brief stimulus complex 
maintains higher rates than a nonpaired stimulus 
(Hamm & Zimmerman, 1972; J. Zimmerman & Han- 
ford, 1966, 1967). This comparison is compromised, 
however, because the nonpaired stimulus differed in 
many ways from the food-paired stimulus. A compari- 
son in which both stimuli were arbitrary would per- 
mit a less ambiguous test of the role of association 
with food. 

Other experiments have opposed the effects of food 
presentation under one schedule with brief-stimulus 
presentation under another schedule presented con- 
jointly. The behavior controlled by the conjointly 
presented brief stimulus is chosen such that it reduces 
the frequency of food presentation. Randolph and 
Sewell (1965) scheduled food under DRL 20-sec and 
DRL 30-sec, and conjointly scheduled brief presenta- 
tion of the feeder light and offset of key and house- 
light under FR 10. Response rate under the conjoint 
schedule was higher with brief stimulus presentations 
than without them, especially on the DRL 30-sec sched- 
ule. Stubbs (1967) also found greater rate increases 
with longer DRL values. Similarly, when Clark and 
Sherman (1970) arranged for non-matching-to-sample 
responses to produce a food-paired stimulus, the fre- 
quency of matching decreased by 30-40%, even 
though only matches produced food. Responses early 
in the fixed-interval period were also increased when 
bar pressing by rats produced a stimulus paired with 
water under a VI schedule (D. W. Zimmerman, 1969), 
or under a shorter FI than the one that arranged 
water delivery (D. W. Zimmerman, 1971). When the 
brief stimulus was presented independently of re- 
sponding under a VT schedule, only small changes in 
responding were observed (D. W. Zimmerman, 1971). 

As previously mentioned, some experiments on con- 
joint schedules have found minimal effects of brief 
stimuli. Neuringer and Chung (1967) found no rate 
increases with conjoint FR 10:S VI 1-min:food. 
Similarly, a 0.7-sec blackout paired with food en- 
hanced responding only slightly under conjoint FR:S 
FI:food schedules according to Shull, Guilkey, and 
Witty (1972). Stubbs (1971) also found no significant 
effects of a brief stimulus unpaired with food pre- 
sented under FI 60-sec, with food presented under VI 
240-sec. A second-order schedule with the same compo- 
nents engendered substantial patterned responding 
whether the brief stimulus was paired or unpaired 
with food. 

The results of experiments on concurrent and con- 
joint schedules of brief stimulus presentation indicate 
that stimuli that precede and/or accompany food 
presentation can often increase the rate of responses 
that produce them. Although the effects with concur- 
rent schedules are often quantitatively small, they are 
sustained over long periods of time and can continue 
when food is no longer available. 

One inference from these experiments has impor- 
tant implications for the further study of rate enhanc- 
ing effects of brief stimuli. It appears that such effects 
can best be obtained when the response that produces 
the brief stimulus is under minimal control by food 
reinforcement. Thus, the clearest sustained effects are 
obtained when the response never produces food, as in 
the concurrent schedules, or is less strongly controlled 
by the schedule of food reinforcement, as in long 
DRL schedules or early in the fixed-interval period. 
Stated another way, the effects of the brief stimulus 
can be masked by the effects of food reinforcement, or 
by ongoing high response rates. When food delivery 
controls responding rather strongly, only discrimina- 
tive effects of lower rate after the stimulus may ap- 
pear. This seems the most frequent result in the 
majority of studies on second-order schedules, and it is 
an important consideration for experiments designed 
to test whether a stimulus paired with food delivery is 
a more effective reinforcer than one that is unpaired 
(cf. Cohen & Stubbs, 1976). 


CONDITIONED REINFORCEMENT IN 
SECOND-ORDER SCHEDULES OF BRIEF 
STIMULUS PRESENTATION 


The preceding review has shown that second-order 
schedules of brief stimulus presentation engender con- 
sistent and well maintained patterns of responding. 
Such patterns depend on the presentation of brief 
stimuli, but may reflect primarily their discriminative 
effects due to a consistent relation of the stimulus to 
subsequent food presentation. Whether a reinforcing 
(rate enhancing) effect is obtained depends on the 
baseline performance and schedule for presenting the 
brief stimulus. Response maintenance or enhancement 
can be observed most reliably during initial exposure 
to certain schedules of stimulus presentation. Since 
the focus of many experiments with these procedures 
has been steady state performance, some of these 
experiments provide poor evidence for a maintained 
conditioned reinforcing effect. When the relation of 
the stimulus to food is less regular, or when control of 
the response by food presentation is weak, as in the 
presence of stimuli that control low rates, rate en- 
hancing effects can be observed, and are sustained. 
Striking instances are during extinction (Thomas, 
1969), or when a concurrent or conjoint response does 
not produce food (J. Zimmerman, 1963, and related 
papers), and during periods of low rate in FI (D. W. 
Zimmerman, 1971) and FR schedules (Findley & 
Brady, 1965). 


CONCLUDING REMARKS 


The concept of conditioned reinforcement has been 
used as both an explanatory term in analyses of oper- 
ant behavior in general, and as a concept to organize 
a variety of behavioral procedures. A reflex orienta- 
tion which emphasized close temporal relationships 
predisposed early workers to interpret response acqui- 
sition or maintenance with delayed or intermittent 
reinforcement in terms of immediate consequences of 
behavior. Experimental paradigms were arranged to 
demonstrate these effects. Stimuli associated with effec- 
tive reinforcers (e.g., food) were shown to prolong per- 
formance when primary reinforcers were omitted, to 
reduce the decrement caused by a delay between be- 
havior and reinforcer, or even to serve as the sole 
consequence in acquisition of a new operant (cf. Mil- 
ler, 1951; Myers, 1958). More detailed analyses have 
revealed, however, that these stimuli can also be serv- 
ing a discriminative function. 

More recent results with chained schedules and 
second-order schedules of brief-stimulus presentation 
also reveal strong discriminative effects of putative 
conditioned reinforcers. This has led some writers to 
be extremely skeptical about a rate-enhancing func- 
tion of such stimuli (e.g., Schuster, 1969; Longstreth, 
1971). 

It is hardly a novel discovery, however, that stimuli 
have multiple functions (cf. Skinner, 1938). Different 
functions may be revealed differentially by various 
experimental procedures or under parametric manip- 
ulation. The fact that a behavioral effect depends on 
the schedule of presentation of a stimulus is not 
unique to conditioned reinforcers. The behavioral 
effects of drugs similarly depend on schedules and 
baseline performance (Dews, 1955; Kelleher & Morse, 
1968). 

The concept of conditioned reinforcement has 
played an important role in the development of prac- 
tical techniques of behavioral control (Ayllon & Azrin, 
1968; O'Leary & Drabman, 1971). The detailed anal- 
ysis of some of these procedures reveals that relevant 
stimuli have other, and sometimes more important 
effects than response enhancement. Such results 
should not, however, lead us to neglect rate-enhancing 
effects under appropriate conditions. It would in- 
deed be ironic if, at the same time that use of con- 
ditioned reinforcement techniques becomes a com- 
monplace in the applied analysis of human behavior, 
the results of limited experimental paradigms were 
taken as disproving the existence of conditioned rein- 
forcement. The trends in experimental analysis re- 
viewed here indicate that schedules can greatly mod- 
ulate conditioned reinforcing effects. 


11 


Conditioned Reinforcement 


Choice and Information 


INTRODUCTION 


Three Conceptions of Conditioned Reinforcement 


Kelleher and Gollub (1962) and Kelleher (1966) 
have reviewed the important techniques that have 
been used to study conditioned reinforcement. 
Chained schedules and other second-order schedules 
occupy large portions of these reviews, and their con- 
tinuing importance in the study of conditioned rein- 
forcement has again been emphasized in the preceding 
chapter by Gollub. Over the past decade two addi- 
tional techniques have become increasingly popular 
and important in the assessment of conditioned rein- 
forcement: the study of observing responses (after 
Wyckoff, 1952, 1969), which has evaluated the rein- 
forcing strength of stimuli that signal the availability 
of, or provide information about, primary reinforce- 
ment; and the study of choice for stimuli associated 
with schedules of primary reinforcement (after Autor, 
1960, 1969). In this chapter I discuss research with 
these techniques and the important implications of 
this research for the theory of conditioned reinforce- 
ment. I begin by introducing the theories and dis- 
tinguishing among them on conceptual grounds and 
in terms of empirical predictions. The three most 
viable conceptions of how a neutral stimulus acquires 
strength based on its relationship to primary reinforce- 
ment appear to be the following: (1) the pairing 
hypothesis, which states that the simple pairing of a 
stimulus with a primary reinforcer imparts condi- 
tioned reinforcing strength to that stimulus; (2) the 
delay reduction hypothesis, which states that the 
strength of a stimulus as a conditioned reinforcer is a 
function of the reduction in time to reinforcement 
correlated with the onset of that stimulus; (3) the un- 
certainty reduction hypothesis, which states that the 
strength of a stimulus is a function of its informative- 
ness about primary reinforcement, i.e., how much un- 
certainty reduction it provides about reinforcement.¹ 


thank the members of my seminar on conditioned reinforcement 
early in 1974: Steve Buck, Mark Fridovich, John Hale, Cheryl 
Logan, Jay Moore, Jim Norborg, Nancy Squires, and especially 
Tibor Safar. 

When applied to observing responses, the pairing 
and delay reduction hypotheses are both forms of a 
more general hypothesis, the conditioned reinforce- 
ment hypothesis, which may be most clearly distin- 
guished from the uncertainty reduction hypothesis by 
the assertion that only stimuli having positive associa- 
tions with primary reinforcement should reinforce 
observing responses (Dinsmoor, Browne, & Lawrence, 
1972). The uncertainty reduction hypothesis, on the 
other hand, states that stimuli associated with negative 
outcomes should also be reinforcing; the conditioned 
reinforcement hypothesis requires that these stimuli 
be aversive—or at least not positively reinforcing. 
Much of the work on observing responses concerns 
these opposing predictions. As we shall see, virtually 
all of the evidence supports the conditioned reinforce- 
ment hypothesis. 

The pairing and delay reduction versions of the 
conditioned reinforcement hypothesis may be distin- 
guished in terms of temporal factors relating the con- 
ditioned reinforcing stimulus and the primary rein- 
forcer. According to the pairing hypothesis, the degree 
of contiguity between the stimulus and the primary 
reinforcer determines the strength of that stimulus as 
a conditioned reinforcer. Contiguity has been mea- 
sured often as the interval between the offset of the 
stimulus and the onset of the primary reinforcer. By 
this measure any stimulus that is perfectly contiguous 
with the primary reinforcer—i.e., with a 0-sec interval 
between the offset of the stimulus and the onset of the 
reinforcer—should be maximally effective as a con- 
ditioned reinforcer. As many studies, including those 
of observing and choice, suggest, a stimulus associated 
with a higher rate of primary reinforcement—as in 
a fixed-interval (FI) 10-sec schedule—is generally a 
more effective conditioned reinforcer than one asso- 
ciated with a lower rate (as in an FI 60-sec schedule), 
despite the fact that both stimuli are perfectly con- 
tiguous with the primary reinforcer in the sense noted. 
Thus a pairing hypothesis based on this view of con- 
tiguity, henceforth the traditional pairing hypothesis, 
is inadequate. A more viable measure of contiguity in 
these cases, therefore, is the interval between the onset 
of the stimulus and the onset of the primary rein- 
forcer. This measure (time/reinforcement) is closely 
related to the rate of reinforcement in the presence of 
the stimulus. More generally, a pairing view will fare 
best if it is couched in terms of reinforcement density 
(i.e., number of reinforcements/unit time) in the 
presence of a given stimulus. A pairing hypothesis 
based on reinforcement density states that conditioned 
reinforcing strength will be determined by the rate of 
primary reinforcement in its presence. Such a pairing 
hypothesis, which bears only superficial similarity to 
the traditional pairing hypothesis, I shall call the 
reinforcement density hypothesis.² 


1 Two of the more important traditional hypotheses which 
have proven less viable than the pairing hypothesis are the 
discriminative stimulus hypothesis (Keller & Schoenfeld, 1950) 
and the cue strength hypothesis (Wyckoff, 1959). These have been 
ably reviewed in a previous volume (Kelleher, 1966) and will 
not be discussed here. The discriminative stimulus hypothesis 
has also been reviewed in the preceding chapter. 
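
The contrast between the two contiguity measures in the FI 10-sec and FI 60-sec example can be made concrete with a small Python sketch. It is illustrative only and rests on stated assumptions: the class and method names are hypothetical, each stimulus is taken to stay on until food is delivered, and food is taken to arrive exactly at the end of the interval (i.e., responding is prompt).

```python
from dataclasses import dataclass


@dataclass
class StimulusEpisode:
    """One stimulus presentation ending in food (all times in seconds)."""
    stimulus_onset: float
    stimulus_offset: float
    food_onset: float

    def offset_to_food(self) -> float:
        # Contiguity as measured under the traditional pairing hypothesis.
        return self.food_onset - self.stimulus_offset

    def onset_to_food(self) -> float:
        # Time per reinforcement in the presence of the stimulus.
        return self.food_onset - self.stimulus_onset

    def density_per_min(self) -> float:
        # Reinforcements per minute in the presence of the stimulus.
        return 60.0 / self.onset_to_food()


# Assumed idealized episodes for the two schedules in the text.
fi10 = StimulusEpisode(stimulus_onset=0.0, stimulus_offset=10.0, food_onset=10.0)
fi60 = StimulusEpisode(stimulus_onset=0.0, stimulus_offset=60.0, food_onset=60.0)

print(fi10.offset_to_food(), fi60.offset_to_food())    # 0.0 0.0
print(fi10.density_per_min(), fi60.density_per_min())  # 6.0 1.0
```

Under these assumptions the offset-to-food measure is identical for the two stimuli, whereas the density measure separates them, which is the distinction the reinforcement density formulation relies on.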

The delay reduction hypothesis also states that the 
reinforcing strength of a stimulus is determined, in 
part, by the length of the interval between the onset 
of the stimulus and the onset of the primary rein- 
forcer. But this interval length must be considered 
relative to the length of the interval measured from 
the onset of the preceding stimulus to the onset of the 
same primary reinforcer. In other words, the contribu- 
tion of contiguity to the conditioned reinforcing 
strength of a stimulus must be considered in the con- 
text of how remote primary reinforcement had been 
prior to the onset of the stimulus. The greater the per- 
centage improvement, in terms of contiguity, to 
primary reinforcement correlated with the onset of 
the stimulus, the greater its conditioned reinforcing 
strength. Thus a stimulus associated with an FI 30-sec 
schedule should be a stronger reinforcer if it is pre- 
ceded by a 60-sec period of nonreinforcement than if 
it is preceded by a 10-sec period of nonreinforcement, 
since in the first case the onset of the 30-sec interval is 
correlated with a 2/3 reduction in time to primary 
reinforcement (of an original waiting time of 90 sec, 
only 30 sec—or 1/3—remains once the stimulus corre- 
lated with the interval schedule appears), but in the 
second case only with a 1/4 reduction in time to pri- 
mary reinforcement (of an original waiting time of 40 
sec, 30 sec—or 3/4—still remains once the stimulus 
correlated with the interval schedule appears). Neither 
the traditional pairing hypothesis nor the reinforce- 
ment density hypothesis distinguishes between these 
two cases. While there is no direct evidence bearing 
on this prediction, the results of Taus and Hearst 
(1970) and of Byrd (1971) suggest that the delay reduc- 
tion hypothesis would be supported. The data from 
each of these studies show that the discriminative 
strength of a stimulus (in terms of rate of responding 
in its presence) increases with the duration of a pre- 
ceding period of nonreinforcement even though the 
temporal relation of the stimulus to reinforcement is 
unchanged. Since the conditioned reinforcing strength 
of a stimulus often covaries with its discriminative 
strength, the prediction made above by the delay re- 
duction hypothesis would likely be borne out. 


2 A phrase suggested by Dr. James Dinsmoor. 
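
The arithmetic behind the FI 30-sec comparison can be written as a short Python sketch. This is only an illustration of the proportional reduction described in the text, not a formal statement of the hypothesis; the function name and its two parameters are hypothetical, and the total waiting time is assumed to be the preceding period of nonreinforcement plus the interval signaled by the stimulus.

```python
from fractions import Fraction


def delay_reduction(preceding_sec, stimulus_to_food_sec):
    """Fraction of the original waiting time removed when the stimulus appears.

    preceding_sec: period of nonreinforcement before stimulus onset.
    stimulus_to_food_sec: interval from stimulus onset to food.
    """
    total_wait = preceding_sec + stimulus_to_food_sec
    remaining = Fraction(stimulus_to_food_sec, total_wait)
    return 1 - remaining


# FI 30-sec stimulus preceded by 60 sec versus 10 sec of nonreinforcement.
long_preceding = delay_reduction(60, 30)
short_preceding = delay_reduction(10, 30)
print(long_preceding, short_preceding)  # 2/3 1/4
```

The stimulus-to-food interval is 30 sec in both calls, so any measure based on that interval alone returns the same value for the two cases; only the relative measure differs.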

I should make clear that none of the hypotheses re- 
quires that production of the stimuli affect the occur- 
rence of reinforcement. In terms of the delay reduc- 
tion hypothesis, for example, a stimulus correlated 
with a reduction in time to primary reinforcement 
should be a conditioned reinforcer, i.e., it should 
maintain responses (such as observing or choice re- 
sponses) whether or not these responses affect the 
temporal distribution of reinforcement. 

The delay reduction hypothesis will be developed 
more fully as we go on. Research on observing re- 
sponses is equally compatible with both the delay 
reduction and the reinforcement density hypotheses. 
As we shall see, however, research on choice clearly 
favors the delay reduction hypothesis. Thus, for 
simplicity, I shall stress the delay reduction hypothesis 
when discussing observing, though the reader should 
be aware that similar experimental outcomes are re- 
quired by the reinforcement density hypothesis. Re- 
search on observing responses shows that the uncer- 
tainty reduction hypothesis is untenable. Thus it will 
be seen that only the delay reduction hypothesis of 
conditioned reinforcement is consistent with what is 
known about observing and choice. 


The Pairing Hypothesis 


The pairing hypothesis, which is more parsimo- 
nious than either the delay reduction or uncertainty 
reduction hypotheses, even when formulated in terms 
of reinforcement density, has a rich history. We shall 
consider it and some of its inadequacies before dis- 
cussing the work on observing and on choice, at which 
time the two newer hypotheses will be more fully 
developed and evaluated. As the reviews of Kelleher 
and Gollub (1962), Kelleher (1966), and Nevin (1973) 
have concluded, the pairing hypothesis (after Hull, 
1943) is the most viable of the traditional viewpoints 
of conditioned reinforcement. By the time of Nevin’s 
(1973) review, however, it was clear that the pairing 
hypothesis had serious shortcomings, as Nevin himself 
pointed out. Some of these shortcomings also apply to 
the reinforcement density version of the pairing 
hypothesis—as I shall note below. 

The pairing hypothesis is supported by the com- 
mon observation that a stimulus paired with uncondi- 
tioned reinforcement acquires the properties of a rein- 
forcer (e.g., Nevin, 1973). It now appears that pairings 
or contiguity is effective only so long as a correlation 
exists between the stimulus and reinforcer. For 
example, Rescorla (1967, 1968, 1972) has shown that 
in Pavlovian fear conditioning when the probability 
of an unconditioned stimulus (US) in the presence 
of a conditioned stimulus (CS) is held constant, the 
degree of conditioned suppression may be sharply in- 
fluenced by manipulating the probability of the US 
in the absence of the CS; when the probability of a 
US presentation is equal in both the presence and 
absence of the CS, no suppression occurs despite the 
fact that the number of pairings is kept constant. 
This result points to a conclusion that will gain sup- 
port throughout this chapter whether we are discuss- 
ing stimulus-reinforcer pairings, stimulus-reinforcer 
correlations, or other stimulus-reinforcer relations: 
The context in which stimulus-reinforcer events are 
embedded affects the strength imparted to the stim- 
ulus by the reinforcer. For example, in both the ob- 
serving response and concurrent chains paradigms we 
shall see that a stimulus will reinforce behavior (ob- 
serving responses in the first case, choice in the second) 
only when it is correlated with a reduction in the 
average time to primary reinforcement, regardless of 
the absolute temporal relation between the stimulus 
and primary reinforcement. 
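
The distinction between pairing and correlation in the Rescorla example can be sketched numerically. The Python fragment below is a hedged illustration: the difference between the two conditional probabilities is used here as one convenient index of correlation (an assumption of this sketch, not a measure named in the text), and the probability values are hypothetical.

```python
from fractions import Fraction


def contingency(p_us_given_cs, p_us_given_no_cs):
    """Difference between the probability of the US with and without the CS."""
    return p_us_given_cs - p_us_given_no_cs


# Hypothetical values: the probability of the US in the presence of the CS
# (and hence the number of pairings) is held constant at 4/10, while the
# probability of the US in the absence of the CS is varied.
p_in_cs = Fraction(4, 10)
for p_outside_cs in (Fraction(0), Fraction(2, 10), Fraction(4, 10)):
    print(p_outside_cs, contingency(p_in_cs, p_outside_cs))
```

When the two probabilities are equal the index is zero even though pairings still occur, which corresponds to the condition in which no suppression is reported.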

The most striking evidence for the pairing hypoth- 
esis has come from studies investigating three types 
of second-order schedules, considered in the previous 
chapter: tandem, brief-stimulus, and chained sched- 
ules. On a tandem schedule, the same stimulus is 
present throughout. A brief-stimulus schedule is the 
same as a tandem schedule, except that the end of 
each component is signaled by the brief presentation 
of a second stimulus. Each component of a chained 
schedule is associated with a different stimulus. The 
differences in the responding maintained by these 
three types of second-order schedules have been ex- 
plained by the conditioned reinforcing properties of 
the brief stimuli or the stimuli comprising the chained 
schedule (e.g., Kelleher, 1966). According to this view, 
the brief-stimulus presentations occurring at the end 
of each schedule component are effective conditioned 
reinforcers because they are intermittently paired with 
primary reinforcement. Similarly, the terminal-link 
stimulus of a chained schedule is an effective condi- 
tioned reinforcer because it is contiguous with pri- 
mary reinforcement. Studies of second-order schedules 
have demonstrated increments in responding relative 
to that maintained on tandem control schedules. It 
appeared, then, that pairing sufficed to create effective 
conditioned reinforcers. 
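
The three schedule types defined above differ only in how stimuli are arranged across components, and that arrangement can be sketched as follows. The labels "S1", "S2", and so on, and the function itself, are hypothetical conveniences rather than notation from the literature.

```python
from typing import List, Optional, Tuple

# (stimulus present during the component, signal presented at its end)
Component = Tuple[str, Optional[str]]


def second_order_schedule(kind: str, n_components: int) -> List[Component]:
    """Stimulus arrangement for one cycle that ends in primary reinforcement."""
    if n_components < 1:
        raise ValueError("a schedule needs at least one component")
    if kind == "tandem":
        # The same stimulus is present throughout; nothing marks component ends.
        return [("S1", None) for _ in range(n_components)]
    if kind == "brief-stimulus":
        # As tandem, but a brief second stimulus signals the end of each component.
        return [("S1", "brief S2") for _ in range(n_components)]
    if kind == "chained":
        # Each component is associated with a different stimulus.
        return [(f"S{i}", None) for i in range(1, n_components + 1)]
    raise ValueError(f"unknown schedule type: {kind}")


for kind in ("tandem", "brief-stimulus", "chained"):
    print(kind, second_order_schedule(kind, 3))
```

The sketch represents only the stimulus arrangement; the response requirement within each component is left out.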

The relevance of the second-order schedule data for 
the pairing hypothesis of conditioned reinforcement 
(and indeed for conditioned reinforcement in general) 
has been called into question more recently by the
experiments of Stubbs (1971), Stubbs and Cohen 
(1972), and Squires, Norborg, and Fantino (1975). 
Stubbs showed that brief stimuli presented at the end 
of each component of a second-order schedule en- 
hanced responding even when they were always omit- 
ted at the end of the component preceding primary 
reinforcement. Despite the fact that these brief stimuli 
were never paired with primary reinforcement, they 
were just as effective in maintaining behavior as were 
paired brief stimuli. This result is also inconsistent 
with the reinforcement density version of the pairing 
hypothesis since primary reinforcement occurs frequently,
i.e., with high density, in the presence of the
paired but not the unpaired brief stimulus. In 
Stubbs’s experiment the pairing procedure consisted 
of the simultaneous pairing of the brief stimulus and 
food. Stubbs and Cohen (1972) found comparable re- 
sults with pairing procedures in which the brief stim- 
ulus preceded food. Specifically, they found similar 
behavioral effects with (1) a simultaneous pairing pro- 
cedure, (2) a procedure in which the brief stimulus 
preceded and overlapped food, and (3) a procedure in 
which the brief stimulus preceded but did not overlap 
food. 

Squires et al. (1975) tested the following alternative 
explanation of the difference in responding main- 
tained by tandem, chain, and brief-stimulus schedules: 
Stimuli which signal the relative unavailability of 
reinforcement suppress responding in the early compo- 
nents of second-order schedules in which primary rein- 
forcement is never available. Thus responding in the 
early portions of a tandem schedule is better main- 
tained than responding on a comparable chain sched- 
ule because the visual stimuli in the chain are more 
effective cues for nonreinforcement (and hence nonre- 
sponding) than are the temporal cues present in the 
tandem schedule. On the other hand, brief-stimulus 
presentations should be reliable cues for nonreinforce- 
ment in the subsequent component only if the subject 
is sensitive to the number of brief-stimulus presenta- 
tions that have occurred since the previous primary 
reinforcement. Since number may be a much less effec- 
tive cue than color, at least in pigeons, Squires et al. 
(1975) reasoned that a brief-stimulus schedule may be 
one in which pigeons have difficulty discriminating 
among the components of the schedule, making the 
brief-stimulus schedule functionally a schedule of rein- 
forcement in which each component is reinforced a 
certain percentage of the time (this position has much
in common with that of Neuringer & Chung, 1967).

In order to test this proposition, Squires et al. ex- 
posed pigeons to a series of second-order schedules in 
which the completion of a fixed number of FI components
(on the “main” response key) was required for
primary reinforcement. For example, in Experiment 
1, brief (2-sec) stimulus presentations on a second key 
(the “brief-stimulus key”) followed the completion of
each FI component (on the main key). In addition, 
during the final brief-stimulus presentation preceding 
primary reinforcement, a response was required on the 
second key in order to produce primary reinforce- 
ment. Prior to the end of the final component, re- 
sponses to the brief-stimulus key had no consequences. 
Such responding did serve as a measure of the extent 
to which the brief-stimulus components were discrim- 
inated from one another. Squires et al. examined be- 
havior on the following second-order schedules: FR 1 
(FI 120-sec), FR 2 (FI 60-sec), FR 4 (FI 30-sec), and FR 8
(FI 15-sec). They found that responding occurred on 
the brief-stimulus key on virtually every brief-stimulus 
presentation. As Figure 1 shows, even when inappro- 
priate responding on the brief-stimulus key (i.e., re- 
sponses to the brief stimulus prior to the final com- 
ponent) produced a 15-sec blackout and returned the 
subject to the beginning of the second-order schedule, 
none of the pigeons learned to withhold these re- 
sponses; consequently, subjects received food only 
rarely in this condition. Results were different when 
different key colors were associated with each compo- 
nent of the second-order schedule. In such a chain 
schedule, brief-stimulus key pecks were confined to the 
last component, i.e., to the only component in which
they were effective.
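
As a matter of arithmetic, the four second-order schedules listed above can be compared by the shortest time in which a cycle could be completed. The sketch treats each FI component as lasting exactly its nominal interval and ignores the brief-stimulus presentations, both of which are simplifying assumptions; the helper name is hypothetical.

```python
# (number of FI components required, FI value in seconds), as listed in the text
schedules = [(1, 120), (2, 60), (4, 30), (8, 15)]


def minimum_cycle_time(n_components: int, fi_sec: int) -> int:
    """Shortest possible time from cycle start to primary reinforcement."""
    return n_components * fi_sec


for n, fi in schedules:
    print(f"FR {n} (FI {fi}-sec): {minimum_cycle_time(n, fi)} sec")
```

Under these assumptions the minimum cycle time is the same for all four schedules, so the schedules differ in the number of components and brief-stimulus presentations per cycle rather than in the minimum time to food.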

These results suggest that pigeons do not discrim- 
inate between the components of second-order sched- 
ules in the absence of differential cues. Squires et al. 
(1975, p. 170) note: 


Another possibility is that the mechanism under- 
lying conditioned reinforcement is stimulus gen- 
eralization, so that the more similar are the 
conditioned and primary reinforcers, the more 
effective will be the conditioned reinforcer. Since 
the paired and unpaired stimuli were identical 
in their similarity to primary reinforcement, 
their effects should have been the same. When 
the paired stimulus more closely resembles the 
primary reinforcer than does the unpaired 
stimulus it will, by this hypothesis, differentially 
enhance response rates. In support of this hypoth- 
esis, a study by Malagodi, DeWeese, and Johnston 
(1973) demonstrated the clear superiority of 
paired brief stimuli over unpaired brief stimuli 
(each added at the end of each component of a 
chain schedule as in our Experiment 3) when the 
paired brief stimuli were brief hopper presentations.

Fig. 1. Probability of a brief-stimulus key response during the
first brief-stimulus presentation following primary reinforcement
for the last three sessions of base line (no punishment), 20
sessions in which those responses were followed by a 15-sec
blackout and return to the start of the first component
(“punishment”), 10 sessions in which the brief-stimulus key was not
illuminated during the usual time of the first brief-stimulus
presentation, and 10 sessions more of the punishment condition.
Data for each of four pigeons. (From Squires et al., 1975. ©
1975 by the Society for the Experimental Analysis of Behavior,
Inc.)

J. Zimmerman and his co-workers (Zimmerman, 1969;
Zimmerman and Hanford, 1966, 1968) have also
maintained considerable responding
when consequence of responding was the production
of a short hopper presentation (too short
to allow eating) and delay of longer hopper presentations.
The apparent superiority of a brief
hopper presentation to other paired brief stimuli 
may be attributed to its resemblance to the pri- 
mary reinforcer [i.e., presentation of grain]. This 
effectiveness may either be due to the consequent 
conditioned reinforcing effects of those stimuli, 
or due to the failure to discriminate whether the 
hopper presentations will be short or [sufficiently] 
long [to permit ingestion of grain]. The latter explanation
would be similar to the failure-to-discriminate
hypothesis suggested above. . . . The
differences in these two explanations (conditioned 
reinforcement vs generalization) is crucial because 
the latter explanation obviates any need for a 
separate conditioned reinforcement concept under
these circumstances. The utility of the concept
of conditioned reinforcement lies in the
prediction that an arbitrary stimulus may be- 
come a conditioned reinforcer. If only those 
stimuli that [at the instant they are presented] 
cannot be discriminated from primary reinforce- 
ment are effective, a separate concept is no 
longer required. 

. . . Although it may still be possible to in- 
voke conditioned reinforcement as an explana- 
tory mechanism for the behavior on different 
second-order schedules, at the present time... 
it is more parsimonious to explain the behavior 
in terms of the discriminative properties of the 
stimuli and of the salience of the stimuli for the 
particular organism. 


The present analysis and the results of Squires et al. 
(1975) suggest that behavior is well maintained on 
second-order schedules because of “conditioned confusion”³
rather than the effectiveness of the paired
stimulus as a conditioned reinforcer. Thus what had 
been the most impressive support for the pairing 
hypothesis—behavior on second-order schedules—may 
turn out not to be support for it at all. Additional 
results, which will be described in the context of 
choice (e.g., Schuster, 1969; Squires, 1972) are also in- 
consistent with the pairing hypothesis. While the pair- 
ing hypothesis may have served us well, it appears to 
be time to discard or modify it in the light of recent 
research. 

While most of the enhanced responding on second- 
order schedules may be due to a discrimination failure, 
as Squires et al. maintain, the high rates of respond- 
ing engendered by this failure probably mask some 
real, albeit small, effect of pairing. The effects of pair- 
ing may become manifest when sufficiently sensitive 
procedures, such as multiple schedules (as in de 
Lorge’s work, 1971), are employed. Nonetheless, most 
of the response enhancement that sometimes occurs 
on second-order schedules now appears to depend 
on pairing only in the following complex sense: 
When the subject cannot discriminate either among 
components of a second-order schedule or between 
brief stimuli and primary reinforcement, poor 
temporal control results and response rates are enhanced;
paired brief stimuli may impair such discriminations.
Depending on the stimuli selected and on
the schedule of reinforcement, discrimination (and
hence temporal control) on unpaired brief-stimulus
schedules may be superior to that on paired brief-stimulus
schedules. If so, response rates on the paired
schedule would exceed those on the unpaired schedule.
If discrimination among components or between unpaired
stimuli and primary reinforcement is poor,
however, response rates on the paired and unpaired
schedules would be equivalent.

³ I thank Dr. Jim Norborg for turning this phrase.


OBSERVING RESPONSES AND 
CONDITIONED REINFORCEMENT 


In Wyckoff’s (1952) observing response procedure, 
periods during which key pecking was reinforced with 
food according to FI schedules alternated with periods 
of extinction. The response key remained white 
throughout both periods, unless the pigeon pressed a 
pedal which turned the key red or green. When the 
two colors were correlated with the schedule in effect, 
much more pedal pressing was maintained than when 
the stimuli and schedules were uncorrelated. Wyckoff 
referred to the pedal presses as “observing responses,” 
i.e., responses resulting in the presentation of a pair
of discriminative stimuli. Other early studies of ob- 
serving responses include those of Prokasy (1956) and 
Kelleher (1958). 

One important explanation of observing responses 
stresses the information obtained from them and 
stipulates that uncertainty reduction reinforces ob- 
serving behavior. A second explanation stresses the 
conditioned reinforcing strength of the stimulus corre- 
lated with the more positive outcome. Some form of 
the first interpretation—which has been called the
uncertainty reduction hypothesis—has been favored by
Berlyne (1957, 1960), Bloomfield (1972), Hendry 
(1969b), Lieberman (1972), Schaub (1969), and Schaub 
and Honig (1967), among others, while some form of 
the second interpretation—which has been called the 
conditioned reinforcement hypothesis—has been fa- 
vored by Dinsmoor, Browne, and Lawrence (1972), 
Jenkins and Boakes (1973), Kelleher and Gollub 
(1962), Kendall (1973a), and Mulvaney, Dinsmoor, 
Jwaideh, and Hughes (1974), among others. 

There are many differences among these general 
interpretations, not all of which can concern us here. 
We should point out, however, that the uncertainty 
reduction and conditioned reinforcement hypotheses 
can have much in common. In particular, a condi- 
tioned reinforcement hypothesis can specify that the 
strength of the stimulus associated with the more
positive outcome depends upon an informative func- 
tion. Consider two possible hypotheses about condi- 
tioned reinforcement strength: One, the traditional 
pairing hypothesis, states that the conditioned rein- 
forcing strength of the stimulus derives from the pair- 
ing of the stimulus with primary reinforcement irre- 
spective of any informative function; a second 
describes the conditioned reinforcing strength of the 
stimulus in terms of its informativeness about the 
immediacy of primary reinforcement. This second 
statement is more compatible with the uncertainty 
reduction hypothesis, though it remains distinct from 
it in at least one important respect, as we shall see 
below. The uncertainty reduction hypothesis may or 
may not be described in terms of conditioned rein- 
forcement. In principle, it is compatible with condi- 
tioned reinforcement, since one could say: “A stim- 
ulus which becomes a conditioned reinforcer does so 
because it reduces uncertainty.” Or one may eschew 
the term conditioned reinforcement, stating instead 
that information about reinforcement—i.e., uncer- 
tainty reduction per se—is a primary reinforcer. 

Whatever the form of the conditioned reinforce- 
ment or uncertainty reduction hypotheses, however, 
the two hypotheses may be distinguished, as Dinsmoor 
et al. (1972), Jenkins and Boakes (1973), and others 
have pointed out: “The assertion that the negative 
value as well as the positive value of the informative 
stimulus variable reinforces the observing response is 
what distinguishes the uncertainty reduction hypoth- 
esis from the conditioned reinforcing hypothesis of 
observing behavior” (Jenkins & Boakes, 1973, p. 198). 

In this section we shall first consider evidence on 
the question whether information about negative out- 
comes is reinforcing. We shall then summarize evi- 
dence from observing response studies which have 
varied the probability of the positively valued alterna- 
tive in order to develop a quantitative formulation of 
observing behavior. The resultant hypothesis of ob- 
serving behavior will be a form of the conditioned 
reinforcement hypothesis consistent with the theory 
of conditioned reinforcement suggested by studies of 
choice behavior to which I shall then turn.


Are Negative Stimuli Reinforcing? 


As Bloomfield (1972) has stated: 


Another way in which information-transmission 
theory confronts reinforcement-contiguity theory 
in the field of secondary reinforcement is 
through the function assigned to signals that 
precede a negatively valued environmental
event. According to contiguity theory, a signal 
of that kind must become aversive, or at least 
much less reinforcing than the other cues avail- 
able. The information hypothesis, on the con- 
trary, requires that “bad news” be just as much 
“news” as “good news” and so does not differ- 
entiate these two cases. (p. 194) 


There is also the position that predicts intermediate 
results: the negative stimuli should be reinforcing, but 
less so than the positive stimuli (Hendry, 1969b; 
Schaub, 1969). In either case, the negative stimuli 
should be reinforcing, not aversive. It should be added 
that the conditioned reinforcement hypotheses would 
also assign conditioned reinforcing value to “bad 
news” if the news could be acted on to increase the 
likelihood of negative reinforcement (in an escape or 
avoidance procedure) or of positive reinforcement (in 
an alternative response procedure). As far as I know, 
the experiments suggested by these latter predictions 
have not been carried out. 

Three studies have been widely cited as evidence 
that “bad news” is reinforcing (Lieberman, 1972; 
Schaub & Honig, 1967; Schaub, 1969). Schaub and 
Honig’s procedure consisted of alternation between 
periods of reinforcement and extinction for both 
“master” pigeons and yoked control pigeons. In rein- 
forcement periods, pecks at a white key by the master 
subjects were reinforced on a variable-interval (VI) 1- 
min schedule and also produced a change in the color 
of the key from white to red—for 1.5 sec—on a fixed- 
ratio (FR) 3 schedule. In extinction periods, pecks at 
the white key were never followed by primary rein- 
forcement, but pecks did produce a change in the 
color of the key from white to green (also on an FR 3 
schedule). Yoked subjects received the same cues inde- 
pendent of their responding. Schaub and Honig found 
that the master pigeons pecked at a high rate during 
extinction periods (although not at as high a rate as 
in the reinforcement periods), suggesting that the 
production of the green stimulus (associated with 
extinction) reinforced responding during the extinc- 
tion period. The yoked subjects generally responded 
only during reinforcement periods. Schaub (1969) 
then conducted a more elaborate study with the same 
basic procedure, except that responding could produce 
only a single stimulus: either the positive stimulus, 
associated with the VI schedule, or the negative stim- 
ulus, associated with extinction. He found that when 
subjects could produce only positive stimuli (i.e., when 
responding was completely ineffective during extinc- 
tion periods), they responded more in extinction than 
did yoked control subjects (who again received key 
light changes whenever the experimental subjects
produced them). In a later portion of his experiment, 
Schaub provided response-independent stimuli which 
made the response-produced stimuli redundant for the 
experimental birds. This eliminated performance 
differences between the experimental and control sub- 
jects. When the response-independent stimuli were 
then eliminated, response rates during extinction in- 
creased for the experimental birds, suggesting that re- 
sponding was reinforced by the response-produced 
negative stimuli when they were no longer redundant. 

As Dinsmoor (Dinsmoor, Flint, Smith, & Viemeister,
1969; Dinsmoor et al., 1972), Wilton and Clements
(1971b), and Bloomfield (1972) have pointed out, how- 
ever, the Schaub studies are not completely convinc- 
ing. In Schaub and Honig (1967), for example, re- 
sponding in the procedure with only negative stimuli 
may have been maintained by the occasional presenta- 
tion of the positive stimulus after the period of ex- 
tinction terminated (a possible interpretation of the 
results raised first by Schaub and Honig). Moreover, in 
either of the Schaub studies the absence of stimulus 
change after three responses may have signified that 
the positive component was in effect. Dinsmoor et al. 
(1972) made observing behavior effective on an 
aperiodic schedule of reinforcement (either VI 1-min 
or VI 2-min schedules) on which the absence of stimulus
change was the usual consequence of a peck to
the observing key (and hence could not reduce uncertainty).
Thus the only significant uncertainty reduction
was provided by stimulus change. In addition,
observing responses were made on a different key than 
food responses. If the uncertainty reduction hypothesis 
were correct, observing responding should be main- 
tained equally well if its only consequence were 
production of the negative stimulus (correlated with 
an extinction period on the food key) or if its only 
consequence were production of the positive stimulus 
(correlated with responding on an interval schedule 
of positive reinforcement). On the other hand, if the 
conditioned reinforcement hypothesis were correct, 
observing responding should be maintained only if 
the positive stimulus were sometimes produced. 
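
The contrasting predictions just described can be laid out explicitly. The sketch below encodes only the qualitative predictions stated in the text for the three arrangements (positive stimulus only, negative stimulus only, or both); the function and argument names are hypothetical, and no claim is made about the relative rates that either hypothesis would predict.

```python
def predicts_observing(hypothesis: str,
                       produces_positive: bool,
                       produces_negative: bool) -> bool:
    """Whether observing should be maintained under the named hypothesis."""
    if hypothesis == "uncertainty_reduction":
        # Any informative stimulus, positive or negative, should reinforce observing.
        return produces_positive or produces_negative
    if hypothesis == "conditioned_reinforcement":
        # Observing is maintained only if the positive stimulus is sometimes produced.
        return produces_positive
    raise ValueError(f"unknown hypothesis: {hypothesis}")


conditions = {
    "positive only": (True, False),
    "negative only": (False, True),
    "both stimuli": (True, True),
}
for label, (pos, neg) in conditions.items():
    print(label,
          predicts_observing("uncertainty_reduction", pos, neg),
          predicts_observing("conditioned_reinforcement", pos, neg))
```

The two hypotheses part company only in the negative-stimulus-only arrangement, which is why that arrangement is the critical test.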

Dinsmoor et al. (1972) found that observing behavior
was well maintained when only the positive
stimulus could be produced, as shown in the left 
panel of Figure 2, but that responding was eliminated 
when the only consequence of pecking the observing 
key was production of the stimulus signifying ex- 
tinction periods (the center panel of Figure 2). 
Finally, when both stimuli were available, pigeons’
observing behavior was maintained at an intermediate
rate, raising the possibility that production of the
negative stimulus had actually been punishing. The
data shown in Figure 2, for one of Dinsmoor’s
pigeons, were typical for each of the five pigeons in
the experiment, except that for one bird results in the
left and right panels were comparable.

Fig. 2. Rate of observing on successive sessions by one pigeon
(4144) when observing responses produced S+ only (left panel),
S− only (center panel), or both stimuli (right panel). (From
Dinsmoor et al., 1972. © 1972 by the Society for the Experimental
Analysis of Behavior, Inc.)

Several other studies have supported the contention
of Dinsmoor et al. (1972) that negative stimuli do not
reinforce observing behavior (e.g., Blanchard, 1973;
Dinsmoor, Browne, Lawrence, & Wasserman, 1971;
Dinsmoor et al., 1969; Jenkins & Boakes, 1973; Kendall,
1973a; Mulvaney, Dinsmoor, Jwaideh, & Hughes,
1974). For example, Dinsmoor et al. (1971) used a
procedure in which pigeons could produce a display 
of either the positive or negative discriminative stim- 
ulus on a key as long as they stood on a pedal. They 
left the pedal, however, as soon as the negative stimulus
appeared. Mulvaney et al. (1974) studied two
pigeons in an observing procedure with three keys. 
During alternating periods of unpredictable duration, 
responding on the center (food) key was reinforced 
on a VI schedule or was never reinforced. In the ab- 
sence of observing, the color of all three keys was 
yellow. On identical but independent VI observing 
schedules, responding on either of the two side keys 
produced either a positive stimulus (green) on all 
three keys, if the VI food schedule were in effect, or a 
negative stimulus (red) on all three keys, if extinction 
were in effect. In the critical stage of the experiment, 
the negative stimulus could not be produced by re- 
sponding on one of the two side keys; responding on 
the other side key continued to produce both positive 
and negative stimuli. The subjects responded at a 
higher rate on the key that produced only the positive 
stimulus, suggesting that the negative stimulus pun- 
ished responding that produced it. 

Blanchard (1975) studied eight pigeons on a single- 
key, discrete-trials observing procedure. Pecks during 
a trial produced colored key lights which signaled
whether the trial would end with response-independent 
reinforcement or with no reinforcement. These stim- 
uli were produced on a VI schedule which began 
operating at the onset of the trial. A procedure was 
employed which permitted the bird to produce S+
on those trials in which reinforcement would be
delivered with or without producing S− on nonreinforced
trials. In one condition, only a response
preceded by at least 6 sec of nonresponding could produce
S−, while any response that satisfied the VI
requirement produced S+. In another condition, this
contingency was reversed. Thus the pigeons could
selectively produce only S+ (or only S−) by adjusting
their interresponse times appropriately. The pigeons
generally produced fewer negative stimuli in the
course of training, which indicates that S− was punishing
observing behavior.
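
The interresponse-time contingency in Blanchard's procedure, as described above, can be sketched as a decision rule. The sketch assumes that the restricted stimulus also required the VI requirement to be satisfied, which the description implies but does not state outright; all names are hypothetical.

```python
from typing import Optional


def stimulus_produced(reinforced_trial: bool,
                      vi_requirement_met: bool,
                      irt_sec: float,
                      restricted: str = "S-",
                      min_irt_sec: float = 6.0) -> Optional[str]:
    """Stimulus produced by a peck, or None if the peck produces nothing.

    `restricted` names the stimulus that additionally requires a pause of at
    least `min_irt_sec` before the peck; the other stimulus needs only the VI.
    """
    if not vi_requirement_met:
        return None
    stimulus = "S+" if reinforced_trial else "S-"
    if stimulus == restricted and irt_sec < min_irt_sec:
        return None
    return stimulus


# First condition: S- requires a 6-sec pause, S+ does not.
print(stimulus_produced(reinforced_trial=False, vi_requirement_met=True, irt_sec=2.0))
print(stimulus_produced(reinforced_trial=False, vi_requirement_met=True, irt_sec=7.0))
print(stimulus_produced(reinforced_trial=True, vi_requirement_met=True, irt_sec=2.0))
```

Setting `restricted="S+"` gives the reversed condition. In either case a bird that keeps its interresponse times short produces only the unrestricted stimulus, which is what allows selective observing.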

Auge (1974) has extended these observations to a 
situation in which both alternatives involved pri- 
mary reinforcement, but in which one was more posi- 
tive than the other. He studied mixed fixed-ratio, 
fixed-interval schedules and found that observing behavior
is maintained by the occasional presentation
of the stimulus signaling the shorter delay to reinforcement
(i.e., the schedule with the shorter interreinforcement
interval). Observing responses produced
stimuli signaling whether an FI 30-sec or an FR X 
schedule was in effect. When the stimulus signaling 
the FI schedule was eliminated, observing behavior 
was maintained when the FI 30-sec schedule alternated 
with the low-valued FR (e.g., 20 or 30) but not with
large FRs (e.g., 100, 140, or 200). The converse was true 
when the stimulus signaling the FI 30-sec schedule 
was the only one that could be produced by ob- 
serving responses; observing behavior was only main- 
tained when the FI 30-sec schedule alternated with 
large FRs. In other words, only stimuli associated with 
the shorter delay to primary reinforcement maintained 
observing behavior (a result consistent with those 
from an earlier study by Kendall & Gibson, 1965). 
Auge (1973) has also shown that observing responding 
is maintained by the stimulus signaling the larger of 
two reinforcement magnitudes but not by the stimulus 
signaling the smaller. 

Lieberman’s (1972) study provides more convinc- 
ing evidence for the uncertainty reduction hypothe- 
sis than Schaub’s (1969). Nonetheless, his results are 
also interpretable in several ways, some of which have 
been discussed by Dinsmoor et al. (1972) and by Mul- 
vaney et al. (1974). In Lieberman’s basic procedure, a 
variable-ratio (VR) schedule alternated with extinc- 
tion and no exteroceptive stimuli were associated with 
the two types of components. Hence the schedule was
mix VR EXT. Observing responses on a different 
lever produced 6 sec of exposure to a stimulus associ- 
ated with the component in effect. In one experiment, 
the VR value was varied between 5 and 100 and the 
effects on observing responding were studied. Lieber- 
man reasoned that a conditioned reinforcement view 
of observing responding required a decrease in the 
rate of observing responding as the VR value in- 
creased (and hence the density of primary reinforce- 
ment associated with the positive stimulus decreased). 
Instead, an increase was noted (particularly between 
the conditions in which the VR requirement was 5— 
resulting in about three responses per minute—and 
25—resulting in about 5 responses per minute). While 
the result is indeed consistent with an information 
view of observing behavior, it is also consistent with 
other possibilities. In the first place, Lieberman’s pro- 
cedure was essentially a concurrent schedule of food- 
reinforced responding and observing responding. From 
what is known about interactions on concurrent 
schedules (e.g., Catania, 1966), it would be expected
that the greater the VR value for food-reinforced 
responding, the more responding should be main- 
tained by any concurrently available schedule (in this 
case, observing responding). For this reason, more 
observing responding was maintained with higher VR 
values on the food-reinforced lever. In addition, in the 
VR component, where about three responses per sec- 
ond were emitted, an observing response delayed 
primary reinforcement as a result of the changeover 
delay (COD) requirement in effect. A COD was used 
to minimize interaction between responding on the 
two levers; thus primary reinforcement could not be 
obtained for a food lever response until at least 1½
sec after any response on the observing response lever. 
This delay was comparable to the time required for 
the subject to obtain primary reinforcement on a 
VR 5. Since this delay would be more noticeable with 
lower VR values, the failure to obtain an increase in 
observing responding with such values is not sur- 
prising. 
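
The argument about the changeover delay can be checked with rough arithmetic. The sketch takes the response rate of about three per second from the text, treats each VR value as its mean requirement, and assumes a changeover delay of 1.5 sec; the delay value and the function names are assumptions made for illustration.

```python
RESPONSES_PER_SEC = 3.0   # approximate rate in the VR component, from the text
COD_SEC = 1.5             # assumed changeover delay


def seconds_to_complete(vr_value: int, rate: float = RESPONSES_PER_SEC) -> float:
    """Approximate time to emit the mean number of responses a VR requires."""
    return vr_value / rate


def relative_cod_cost(vr_value: int, cod_sec: float = COD_SEC) -> float:
    """Changeover delay expressed as a fraction of the time to complete the ratio."""
    return cod_sec / seconds_to_complete(vr_value)


for vr in (5, 25, 50, 100):
    print(f"VR {vr}: {seconds_to_complete(vr):.1f} sec per ratio, "
          f"COD cost {relative_cod_cost(vr):.2f}")
```

Under these assumptions the delay imposed by one observing response is nearly as long as the time needed to complete a VR 5, but only a small fraction of the time needed for the larger ratios, which is the sense in which the delay would be more noticeable at low VR values.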

Similar arguments and others may be leveled 
against the remaining portions of Lieberman’s ex- 
periment (see also Dinsmoor et al., 1972, p. 80). As a 
last example, consider the main result of his final 
experiment (which in my opinion is the most inter- 
esting), in which observing responding during the 
extinction component of mix VR 50 EXT no longer
produced the negative stimulus (“S−”) after the first
ten (base line) sessions (although the positive stimulus
—the “S+”—could be produced during the VR
component). The result, averaged over five monkeys, 
is shown in Figure 3. Lieberman argues plausibly that
the decline in observing response rates during the
extinction component after the S− is removed suggests
that the S− had been a conditioned reinforcer.
However, the decline is also compatible with the 
notion that observing behavior had been maintained 
by stimulus change. If responding during the extinc- 
tion component is reinforced by stimulus change, then 
less responding should occur once stimulus change is 
no longer available. Moreover, as Dinsmoor et al. 
(1972) have pointed out: 


To show that the light had been functioning as a 
reinforcer, Lieberman then eliminated it from 
his procedure and found that this led to a sharp 
decline in the frequency with which observing 
responses were recorded during the extinction 
component. But the situation without the light 
was not entirely comparable to the situation 
with the light. During the baseline determination,
the presence of the tone or the light indicated
that an observing response would have
no consequence; conversely, the absence of the
tone or the light set the occasion for observing.
When the light was eliminated from the procedure,
there was no stimulus to prevent the
animal from responding during the 6-sec period
when the light would otherwise have been present.
To maintain comparability with the baseline
procedure, Lieberman did not record these
responses. But some of them may have been
responses that under the baseline procedure
would have been postponed until after the light
had terminated and that would therefore have
been recorded. Note that Schaub (1969), using
a different recording procedure, did not find a
corresponding decline in observing when he
withheld S− (Experiment 2a). It is difficult to
see any way in which the data can safely be
compared with and without the light in Lieberman’s
experiment. (p. 80)

Fig. 3. Rate of observing (averaged over five subjects) on successive
sessions before and after removal of the S− in the final
experiment. (From Lieberman, 1972.)


Thus while Lieberman’s results are interesting and 
constitute the only truly provocative support of the 
notion that bad news is reinforcing, they do not per- 
mit a reasonably unambiguous conclusion. Moreover, 
in view of the large body of unambiguous evidence 
suggesting that observing behavior is maintained only 
by stimuli correlated with positive reinforcement, it 
appears best to suspend judgment on the significance 
of Lieberman’s results. 

In conclusion, observing behavior is maintained by 
the stimulus associated with any of the following: (1) 
the presence of a schedule of positive reinforcement 
(as opposed to extinction—e.g., Dinsmoor et al., 1971; 
Wyckoff, 1952); (2) the shorter of two delays to rein- 
forcement (e.g., Auge, 1974; Kendall & Gibson, 1965); 
(3) the larger of two magnitudes of reinforcement 
(e.g., Auge, 1973); (4) the absence of a schedule of
punishment (e.g., Dinsmoor et al., 1969). Observing
behavior is not maintained when there are no differ-
ential outcomes (e.g., Kendall, 1972; Wilton & Clem- 
ents, 1971a) or when only the stimulus associated with 
the less valued of the two outcomes may be observed 
(Dinsmoor et al., 1972). Thus the weight of evidence 
clearly favors the conditioned reinforcement view of
observing behavior over the uncertainty reduction
view. Or in terms of our earlier discussion, the uncer-
tainty reduction hypothesis is tenable only when 
restricted to uncertainty reduction which provides in- 
formation that primary reinforcement is forthcoming. 
In that case, however, the hypothesis becomes indis- 
tinguishable from a conditioned reinforcement hy- 
pothesis that describes the strength of a stimulus in 
terms of its relation to primary reinforcement. More- 
over, as Gollub (1970), Eckerman (1973), and McMil- 
lan (1974) have pointed out, the appeal of the 
information hypothesis, expressed in terms of uncer- 
tainty reduction, depends upon the quantitative pre- 
dictions of information theory (Shannon & Weaver, 
1949; Wiener, 1948). Once these predictions are shown 
to be incorrect when applied to the study of observ-
ing responses, there seems to be little reason to “save”
a modified version of the uncertainty reduction hy- 
pothesis in favor of one based on conditioned rein- 
forcement. I now turn to studies which have tested 
quantitative implications of the uncertainty reduc- 
tion hypothesis by examining the strength of observ- 
ing as a function of changes in the likelihood of the 
positive stimulus.


Studies Varying the Likelihood of Positive Stimuli


Several investigators have tested implications of the 
uncertainty reduction hypothesis by varying the prob- 
ability that observing responses will lead to positive 
stimuli (and, subsequently, positive reinforcement). 
According to the uncertainty reduction hypothesis, 
the amount of observing behavior should be described 
by the symmetrical function shown in Figure 4A. This 
inverted U-shaped function describes the average 
amount of information transmitted by both the posi- 
tive and negative stimuli as a function of the proba- 
bility of the positive stimulus. Since the function is 
symmetrical, observing should be just as strong when 
the probability of the positive stimulus is p as when it 
is 1 − p (e.g., when p = .20 and .80). Figure 4B illus-
trates a function based only upon the amount of in- 
formation transmitted by the positive stimulus. In 
intuitive terms, this function says that observing will 
increase as a function of the uncertainty reduction 
conveyed by stimuli associated with positive out- 
comes; the more unlikely the good news is, the more 
reinforcing when it does come. This function also 
implies that negative stimuli do not affect observing. 
This implication is responsible for the asymmetry. 
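
The two functions contrasted above can be sketched numerically. The following is a minimal illustration, not the original authors' computation: it assumes base-2 logarithms, takes the symmetrical function to be the average information p log(1/p) + (1 − p) log(1/[1 − p]), and takes the asymmetrical function to be the positive-stimulus term p log(1/p) alone. The function names are hypothetical.

```python
import math


def symmetric_information(p):
    """Average information (bits) carried by both stimuli (assumed form of Figure 4A)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # no uncertainty when the outcome is certain
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))


def positive_information(p):
    """Contribution of the positive stimulus alone (assumed form of Figure 4B)."""
    if p <= 0.0:
        return 0.0  # a stimulus that never occurs contributes nothing
    return p * math.log2(1 / p)


# Compare the two functions at the probabilities discussed in the text.
for p in (0.20, 0.50, 0.80):
    print(p, round(symmetric_information(p), 3), round(positive_information(p), 3))
```

Under these assumptions the symmetrical function gives the same value at p = .20 and p = .80, whereas the positive-stimulus function is larger at p = .20 than at p = .80, which is the asymmetry described above.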

If this asymmetrical function accurately describes 
observing, then more observing should occur when 
p = .20 than when p = .80. Wilton and Clements
(1971b) tested this prediction. In their experiment, 
responding on an FI schedule produced both a stim- 
ulus signaling whether or not reinforcement was forth- 
coming and the delayed outcome of the trial: either 
nonreinforcement or response-independent reinforce- 
ment. Thus the same response was required to ad- 
vance to the trial outcome stage as well as to produce 
stimuli correlated with the specific outcome. The 
probability of the positive stimulus (and hence rein- 
forcement) was .80 in one condition and .20 in the 
other. For each of the six pigeons in the one com-
parison made by Wilton and Clements, more observ- 
ing occurred when p = .20 than when p = .80. This 
result suggests that the symmetrical function is incor- 
rect and supports the asymmetrical function. Note 
that this result is consistent with those discussed in 
the previous section: stimuli correlated with negative 
outcomes do not appear to maintain observing. It 
should be noted that Wilton and Clements did not 
include a control condition in which responses ad- 
vanced the subject to the trial outcome stage (as in 
their actual procedure) but did not produce observing 
stimuli. Nor did they include a condition in which 
the subject advanced to the trial outcome stage 
whether or not it made an observing response. With-
out these controls it is not clear what maintained
responding in their study.

Fig. 4. (A) The amount of information transmitted by both S+
and S− as a function of the probability of S+. (B) The amount
of information contributed by S+ to the average amount of
information, again as a function of the probability of S+.
(Adapted from Wilton & Clements, 1971b. © 1971 by the Society
for the Experimental Analysis of Behavior, Inc.)

Fortunately, other data consistent with the asym-
metrical function have been reported by McMillan
(1974), McMichael, Lanzetta, and Driscoll (1967), 
Hendry (1965), and Eckerman (1973). In perhaps the 
most comprehensive study, McMillan (1974) allowed 
pigeons to convert a mix VI EXT schedule of food 
reinforcement to a corresponding multiple schedule 
(mult VI EXT). McMillan varied the probability that 
the VI component was in effect between the following 
values: .00, .20, .35, .50, .65, .80, and 1.00. His prin- 
cipal finding is shown in Figure 5. The data for one 
pigeon (310) are inconsistent with either of the func- 
tions under discussion. Of the remaining five, the data 
from four (all except 309) suggest an asymmetrical 
function similar to that in Figure 4B. Certainly, Mc-
Millan’s data do not support a symmetrical function.4

Fig. 5. Percentage of time spent in the presence of the multiple-
schedule stimuli as a function of the probability of the positive
stimulus. (Adapted from McMillan, 1974. © 1974 by the Society
for the Experimental Analysis of Behavior, Inc.)

4 McMillan’s study also reports interesting data arguing
against the discriminative stimulus hypothesis of conditioned
reinforcement.

Eckerman (1973) and McMillan (1974) have raised
an intuitive objection to the Wilton-Clements hy-
pothesis. They question the significance of consider-
ing the informativeness of positive signals in isola-
tion. Indeed, it is logically impossible to have only
positive information. Certainly, a meaningful applica-
tion of information theory requires consideration of 
both positive and negative outcomes. There is also 
empirical evidence against the appropriateness of an 
asymmetrical observing response function. In the first 
place, Kendall (1973b) found that observing behavior 
was better maintained when the probability of a posi- 
tive outcome was .25 than when it was .50 or .75. But 
according to Figure 4B, the amount of uncertainty 
reduction is equal when p = .25 and .50 (in each case, 
p log [1/p] = .50). Similarly, Steiner (1970) found as 
much or more observing behavior when the probabil- 
ity of a positive outcome was .1 than when it was .5, 
an outcome opposite to that suggested by the shape of 
the function of Figure 4B. Most significantly, the 
function shown in Figure 4B has a telling empirical 
shortcoming when dealing with experiments on the 
effects of negative outcomes. As I discussed above, con- 
vincing evidence that the stimuli associated with 
negative outcomes actually punish observing behavior 
has been supplied recently by Mulvaney et al. (1974) 
and Blanchard (1975). This evidence undermines the
rationale for the asymmetrical function—namely, the 
assumption that negative outcomes do not affect ob-
serving. Wilton (1972) has modified his own view to
suggest that the appearance of negative stimuli pun- 
ishes observing behavior rather than having no effect 
(as suggested by Wilton and Clements, 1971b). Wilton’s 
view is just the opposite of a pure uncertainty reduc-
tion view (held by Berlyne, 1960; Bloomfield, 1972;
and others) which specifies that information about
negative events should be positively reinforcing. Wil-
ton does not propose a quantitative theory specifying
how punishing the negative stimuli are, except to say
that they are less effective in controlling behavior than 
are the positive stimuli. In practice, this will have the 
effect of moving the peak of the function in Figure 
4B toward a lower probability value by some un- 
specified amount. This change appears to make Wil- 
ton’s later hypothesis consistent with the results of 
Blanchard (1975), Kendall (1973b), Steiner (1970), 
and Dinsmoor’s group, which are inconsistent with the 
Wilton-Clements hypothesis. Thus while Wilton’s 
(1972) newer hypothesis may be less valuable than his 
older one in that it is both nonquantitative and more 
complex, it does have the advantage of being consis- 
tent with more of the data. In any case, the new Wil- 
ton theory is not only manifestly inconsistent with the 
uncertainty reduction hypothesis (based on the func- 
tion shown in Figure 4A) but, in stressing both the 
reinforcing and punishing functions of signals, it has 
more in common with the conditioned reinforcement 
hypothesis than with information views.

The results reviewed in this section may be ac-
counted for in terms of a conditioned reinforcement 
theory of observing behavior (such as that of Dins- 
moor’s group), as McMillan has most recently sug- 
gested. Our argument begins with the fact that mixed 
schedules of reinforcement are less reinforcing, as 
measured in choice procedures, than equivalent multi- 
ple schedules (e.g., Bower, McLean, & Meacham, 
1966; Hendry, 1969c; Hursh & Fantino, 1974), what- 
ever the likelihood of the positive component, ex- 
cept at the end points: when p = 1.0 and p = 0, the 
multiple and mixed schedules are equivalent. For in- 
termediate values, the difference in conditioned rein- 
forcing strength between the multiple and mixed 
schedules should increase as p decreases.5 Thus, con-
ditioned reinforcing strength should increase mono- 
tonically as the probability of a positive outcome 
decreases, until the function approaches p = 0 (ex- 
tinction), at which point observing should not occur. 
At the same time, as the probability of a positive out- 
come becomes very small, observing should decline 
since it is rarely reinforced. The observing response 
function generated by these two factors should resem- 
ble an asymmetrical inverted U-shaped function. 

5 When the probability of the positive component is high, the
mixed schedule value is close to that of the positive outcome.
As the probability of the positive outcome decreases, the mixed
schedule value approaches that of the negative outcome. Many
studies of how subjects average positive and negative outcomes
(e.g., Bower, McLean, & Meacham, 1966; Davison, 1969, 1972;
Fantino, 1967; Fantino & Navarick, 1974; Herrnstein, 1964b;
Hursh & Fantino, 1973; Killeen, 1968a; Navarick & Fantino, 1974)
have shown that the positive outcome is weighted far more
heavily than the negative in choice. Since the discrepancy be-
tween the positive outcome on the multiple schedule and the
value of the mixed schedule is greater, the more unlikely the
positive outcome, it follows that preference for the multiple
schedule should be an inverse function of the likelihood of the
positive outcome.

The delay reduction hypothesis makes comparable
qualitative predictions. For example, consider one of
the schedules used by McMillan (1974), a mixed VI
70-sec EXT schedule with equiprobable 40-sec com-
ponents. Observing responses changed this mixed
schedule to the equivalent multiple schedule. In the
presence of the mixed schedule, the average delay to
reinforcement at the beginning of a trial is 140 sec.
Whereas production of the negative stimulus is not
correlated with reduction in delay to reinforcement,
and should not maintain observing, the onset of the
positive stimulus is correlated with an average reduc-
tion of 70 sec (since it is correlated with the VI 70-sec
component), or one-half the average time correlated
with the mixed schedule, and should maintain ob-
serving. The lower the probability of the VI compo-
nent, the greater the reduction in delay to primary
reinforcement correlated with the positive stimulus.
For example, when the probability of the positive out- 
come is .1, the delay to reinforcement in the presence 
of the mixed schedule is 700 sec and the reduction in 
time to reinforcement signified by the positive signal 
is 630 sec or 630/700 = .9 of the average delay signified
by the mixed schedule. This particular view of con- 
ditioned reinforcement is appealing because it is 
consistent with the results of the experiments on 
choice that will be discussed in the next section of the 
chapter. Note that this conditioned reinforcement 
view—like that of Dinsmoor’s group—requires that the 
probability of observing behavior increase as the like- 
lihood of the positive outcome decreases, until some 
unspecified maximum point is reached. The descend- 
ing part of the curve makes good sense, of course: 
when p is sufficiently low, observing behavior is rarely
reinforced and should decline. 
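
The arithmetic in this example can be reproduced with a short sketch. It rests on a simplifying assumption that matches the figures quoted above but is not stated as a formula in the text: the average delay to reinforcement under the mixed schedule is taken to be the VI value divided by the probability of the VI component. The function names are hypothetical.

```python
def mixed_schedule_delay(vi_sec, p_positive):
    """Assumed average delay to reinforcement under the mixed VI EXT schedule."""
    if p_positive <= 0:
        raise ValueError("p_positive must be greater than zero")
    return vi_sec / p_positive


def delay_reduction(vi_sec, p_positive):
    """Reduction in delay signaled by onset of the positive stimulus."""
    return mixed_schedule_delay(vi_sec, p_positive) - vi_sec


def relative_reduction(vi_sec, p_positive):
    """Reduction expressed as a fraction of the mixed-schedule delay."""
    return delay_reduction(vi_sec, p_positive) / mixed_schedule_delay(vi_sec, p_positive)


# The two cases worked through in the text: p = .5 and p = .1 with VI 70-sec.
for p in (0.5, 0.1):
    print(p, mixed_schedule_delay(70, p), delay_reduction(70, p),
          round(relative_reduction(70, p), 2))
```

With a VI 70-sec component this gives a delay of 140 sec and a reduction of 70 sec at p = .5, and a delay of 700 sec and a reduction of 630 sec at p = .1. Under the stated assumption the relative reduction is simply 1 − p, so it grows as the positive component becomes less likely.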


Observing Responses: Present Status


The clearest conclusion that may be drawn from 
the results and theories that have been discussed thus 
far is that the uncertainty reduction hypothesis of 
observing behavior (e.g., Bloomfield, 1972) is incor- 
rect. Rather drastic modifications of this hypothesis 
which assume that negative signals are ineffective 
(e.g., Wilton & Clements, 1971b) or punishing (Wilton, 
1972) make the uncertainty reduction view largely 
equivalent to those stipulating that observing behav-
ior is maintained by conditioned reinforcement (e.g.,
Dinsmoor et al., 1969, 1972; Jenkins & Boakes, 1973). 
One form of the conditioned reinforcement hypothe- 
sis is inconsistent with some of the observing response 
data. In particular, a hypothesis that requires a stimu- 
lus to maintain observing behavior when it is paired 
or contiguous with primary reinforcement would have 
difficulty handling Auge’s (1973, 1974) results: pri-
mary reinforcement was obtained in both components 
of the multiple schedules studied in his experiments, 
but observing behavior was maintained in only one 
component. In addition, as Wilton and Clements 
(1971a) have argued, the fact that little observing re- 
sponding is maintained when reinforcement is avail- 
able on every trial is also difficult to reconcile with 
the pairing hypothesis: as both stimuli in their con- 
tinuous reinforcement condition were consistently 
paired with primary reinforcement, they should be 
strong conditioned reinforcers and should maintain 
observing behavior as well as they do in the condition 
in which reinforcement is available in the presence of 
only one of the two stimuli. Kendall (1972) obtained 
results similar to those of Wilton and Clements (1971a)
with multiple delays of reinforcement; more observing
responding was maintained when one of two 15-sec de- 
lays was increased to 120 sec. While these results are 
indeed incompatible with the traditional pairing hy- 
pothesis, they are completely consistent with both the 
reinforcement density and the delay reduction hy- 
potheses. When both components are correlated with 
the same reinforcement density, the stimulus of the 
mixed schedule is also correlated with the same rein- 
forcement density. Since observing does not produce a 
stimulus correlated with a higher density of reinforce- 
ment, observing should not occur. Similarly, neither 
multiple-schedule stimulus is correlated with a reduc- 
tion in time to reinforcement (relative to the mixed- 
schedule stimulus) when the schedules are equal. 
Hence observing should not occur according to the 
delay reduction hypothesis. These hypotheses predict 
that only the more positively valued of two stimuli 
should maintain observing behavior, as Branch (1970) 
has hypothesized, and as the results from Dinsmoor 
et al. (1972) and Auge (1974) suggest, since the less 
positive stimulus is correlated with an increase, rather 
than a reduction, in average time to primary rein- 
forcement (and a decrease, rather than an increase, in 
reinforcement density). Such stimuli should be aver- 
sive rather than reinforcing according to the delay 
reduction and reinforcement density hypotheses.

In addition, these hypotheses predict that when 
each of two stimuli is associated with a reduction in
delay to primary reinforcement (or an increase in 
reinforcement density), both should maintain observ- 
ing responding, and that more observing responding 
should be maintained by the stimulus associated 
with the greater reduction in time to reinforcement. 
For example, consider a procedure in which observ-
ing responses change a mixed FI 20-sec FI 40-sec FI
180-sec schedule to the equivalent multiple schedule. 
According to the delay reduction hypothesis, the 
stimuli associated with both the FI 20-sec and FI 40- 
sec schedules should maintain observing behavior 
(since the average time to reinforcement in the mixed 
schedule equals 80 sec) with more observing respond- 
ing maintained by the FI 20-sec stimulus. If the 
schedule were changed to MIX FI 20-sec FI 120-sec 
FI 180-sec, however, only the stimulus associated with 
the FI 20-sec component of the multiple schedule 
should reinforce observing. (Since the average time to 
reinforcement in the mixed schedule is now 107 sec, 
the middle-valued stimulus—associated with FI 120- 
sec—no longer represents a reduction in time to rein- 
forcement.) The reinforcement density hypothesis 
makes equivalent predictions. 
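
The prediction in this example can be checked with a small sketch. It assumes, as the averages of 80 sec and 107 sec quoted above imply, that the components are equiprobable, so that the delay signaled by the mixed-schedule stimulus is the simple mean of the FI values. The function name is hypothetical.

```python
def reinforcing_components(fi_values_sec):
    """Map each FI value that signals a delay reduction to the size of that reduction.

    Assumes equiprobable components, so the mixed-schedule delay is the mean FI value.
    """
    mean_delay = sum(fi_values_sec) / len(fi_values_sec)
    return {fi: mean_delay - fi for fi in fi_values_sec if fi < mean_delay}


# First schedule in the text: FI 20-sec and FI 40-sec both signal a reduction.
print(reinforcing_components([20, 40, 180]))
# Second schedule: only FI 20-sec signals a reduction.
print(reinforcing_components([20, 120, 180]))
```

Under this assumption the stimuli for FI 20-sec and FI 40-sec both qualify in the first schedule, with the larger reduction for FI 20-sec, while only the FI 20-sec stimulus qualifies once the middle component is lengthened to 120 sec.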

Whether or not the form of the conditioned rein-
forcement hypothesis specified by the delay reduction
hypothesis proves correct, it is clear that the condi- 
tioned reinforcement explanation of observing be- 
havior (e.g., Blanchard, 1975; Dinsmoor et al., 1972; 
Jenkins & Boakes, 1973; Mulvaney et al., 1974) is more 
consistent with the data than the uncertainty reduc- 
tion explanation (e.g., Berlyne, 1957; Bloomfield, 
1972; Lieberman, 1972). We now turn to studies of 
choice behavior. The picture that emerges from the 
choice literature is consistent with the one we have 
drawn from the study of observing responses. 


CHOICE AND CONDITIONED 
REINFORCEMENT 


The Delay Reduction Hypothesis


The delay reduction hypothesis was developed 
(Fantino, 1969b) to integrate data from a series of 
experiments on choice and conditioned reinforcement 
begun by Autor (1960) and continued by many others 
in the subsequent sixteen years (cf. Fantino & Navarick, 
1974; Hendry, 1969). When applied to choice pro-
cedures, the delay reduction hypothesis states that (1) 
organisms will choose the stimulus correlated with the 
greatest reduction in time to primary reinforcement 
and (2) preference will be greater the larger the dif-
ference in the delay reductions correlated with the
chosen alternatives. Usually, the delay reduction hy-
pothesis has been applied only in this sense of im- 
provement in temporal proximity to reinforcement. 
Obviously, other variables such as punishment and 
the probability and magnitude of reinforcement affect
conditioned reinforcement. The delay reduction hy- 
pothesis may be broadened readily to encompass im- 
provement in reinforcer amount and probability and 
in punishment reduction, as I suggest below. 

First, consider a subject operating a two-levered 
gambling device with the following payoff structure: 
Pulls on either lever are reinforced (according to 
equal VI schedules) by access to one of two sets of 
flashing lights, each correlated with the delivery of a 
response-independent dollar bill. The dollar is de- 
livered after 5 min in the presence of one set of flash- 
ing lights, but after only 1 min in the presence of the 
other lights. After the dollar is received, the subject 
may again pull the levers leading to the flashing lights 
and more dollar bills. Pulls on one lever (when effec- 
tive—as determined by the VI schedule) always lead to 
the same outcome. Choice is measured by the relative 
rates of lever pulling leading to the two outcomes. 
Obviously, many theories of choice or conditioned
reinforcement—including a reinforcement density hy-
pothesis, such as that considered when discussing 
observing—would require that the subject prefer the 
lights correlated with the 1-min delay. How much the 
1-min delay is preferred, however, depends upon an
additional factor we have yet to specify: the size of the 
equal VI schedules leading to the outcomes. Consider 
the following three values for the equal VIs: 1, 6, and 
36 min. With VI 1-min schedules the average interval 
between the onset of a trial and receipt of the dollar 
is 3½ min as long as the subject is responding on both
levers. It takes ½ min, on the average, for concurrent
VI 1-min schedules to arrange reinforcement; the av-
erage wait in the presence of the flashing lights (½ ×
1 min + ½ × 5 min) is an additional 3 min. But by
responding exclusively on the lever leading to the
shorter delay, the subject can receive a dollar every 2
min (1 min on the appropriate VI 1-min plus the
1-min delay). Indeed, for concurrent VIs of 4 min or
less, the subject will increase his or her earnings by 
exhibiting exclusive preference for the 1-min delay: 
only that outcome is correlated with a reduction in 
waiting time to reinforcement. With VI schedules 
greater than 4 min, however, both outcomes are cor- 
related with delay reduction and the subject should 
respond on both levers. For example, with VI 6-min 
schedules it takes an average of 6 min to obtain a dol- 
lar (an average of 3 min of responding on the levers 
and 3 min in the presence of the flashing lights) if the 
subject responds on both levers, but 7 min if re- 
sponses are made exclusively on the lever with the 
shorter delay. Here the 5-min delay represents a re- 
duction of 1 min (i.e., the onset of the flashing lights
correlated with the longer delay brings the subject 1
min closer in time to the dollar than he had been
while in the choice phase) whereas the 1-min delay
represents a 5-min reduction. Thus the short-delay 
outcome represents a reduction 5 times greater than 
the alternative outcome. The longer the VIs, the 
smaller this ratio. Hence, with the VI 36-min sched- 
ules, the delay reductions for the 1- and 5-min out- 
comes are 20 min and 16 min, respectively, a ratio of 
just 5:4. 
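
The worked values in this example can be reproduced with a brief sketch. It assumes, following the text, that two equal concurrent VI schedules arrange entry into an outcome in half the nominal VI value when the subject responds on both levers, and that the two outcomes are entered equally often. The function name and default delays are illustrative only.

```python
def delay_reductions(vi_min, short_delay=1.0, long_delay=5.0):
    """Return (reduction for short delay, reduction for long delay), in minutes.

    Average time to the dollar from the start of the choice phase is the
    average choice-phase time plus the average outcome delay.
    """
    choice_time = vi_min / 2                       # two equal concurrent VIs
    outcome_time = (short_delay + long_delay) / 2  # outcomes entered equally often
    total = choice_time + outcome_time
    return total - short_delay, total - long_delay


# The three VI values considered in the text.
for vi in (1, 6, 36):
    short, long_ = delay_reductions(vi)
    print(vi, short, long_)
```

For VI 1-min the long-delay outcome yields a negative value (no delay reduction), so exclusive preference for the short delay is predicted. For VI 6-min the reductions are 5 and 1 min, and for VI 36-min they are 20 and 16 min. At VI 4-min the long-delay reduction is exactly zero, which corresponds to the boundary mentioned above.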

In summary, the delay reduction hypothesis states 
that subjects’ choice for the more preferred of two 
alternatives should increase, the shorter the choice 
phase: with sufficiently short VIs only one outcome 
represents an improvement in terms of temporal prox- 
imity to reinforcement; as the VIs are made progres- 
sively longer, both outcomes represent improvement 
and the subject should shift away from exclusive 
preference toward indifference. Note that a reinforce- 
ment density hypothesis assumes the choice will de-
pend only on the relative rates of reinforcement dur-
ing the two outcomes and that choice should therefore 
be independent of the length of the choice phase 
(Autor, 1960, 1969; Herrnstein, 1964a). As we shall 
see, data instead support the delay reduction hypothe-
sis.

Since probability of reinforcement is closely analo- 
gous to frequency (probabilities of obtaining a dollar 
may be substituted for waiting times as the outcomes 
in the example above) and since punishment is con- 
ceptually and empirically symmetrical with reinforce- 
ment (though opposite in sign—cf. Fantino, 1973, for 
a review) there is no need to illustrate how the delay 
reduction hypothesis applies to these variables. We 
shall consider choice for reinforcer magnitudes, how- 
ever. Assume you are walking in a foreign city and 
are hungry. You know of two fine restaurants to 
which you had once been taken. You enjoy both but 
have a preference for one. You anticipate it will take 
about X min to track down each restaurant. Do you 
decide to dine at whichever you find first or will you 
hold out for the preferred one? I predict your decision 
would be based, in part, on the size of X (correspond- 
ing to the duration of the choice phase): the shorter 
the choice phase, the more likely you are to persist in 
finding the preferred restaurant. If X is very large, 
you are more likely to behave indifferently to the two. 
If this assumption is correct, then temporal context 
affects choice in the same way whether the outcomes 
differ with respect to rate or magnitude of reinforce- 
ment. Doug Navarick and I supported this notion 
recently in an analogous experiment with pigeons 
choosing between 4.5 and 1.5 sec access to grain: as the 
duration of the choice phase decreased—relative to the 
duration of the outcome phase—choice for the larger 
reward increased. 

Before discussing the delay reduction hypothesis 
and its implications more fully and rigorously and 
presenting supporting data, we should review briefly 
the concurrent chains procedure developed to study 
conditioned reinforcement and used to test the delay 
reduction hypothesis. 


The Concurrent Chains Procedure 


This procedure, which is described fully below, in- 
volves measuring choice by rate of response during 
two equal VI schedules, each leading to a different 
conditioned reinforcer. Since the concurrently avail- 
able VI schedules are equal, differences in response 
rates can be assumed to reflect differences in the rein- 
forcing effectiveness of the stimuli being chosen, i.e.,
the stimuli associated with the terminal links of the
chain. The independent variable is some manipula-
tion of the relation between the conditioned and the
primary reinforcer. The dependent variable is the
relative strength of the two conditioned reinforcers as 
measured in the equal initial links: the number of 
choice responses made for one conditioned reinforcer 
divided by the total number of choice responses made 
for both conditioned reinforcers. 

Beginning with Autor (1960, 1969), the concurrent 
chains procedure has been used extensively in the 
study of both choice and conditioned reinforcement. 
While some investigators have used it primarily to 
study choice, others have emphasized its relevance for 
conditioned reinforcement. Indeed, the same depen- 
dent variable can be taken as both a measure of choice 
and of conditioned reinforcement. Before outlining
the procedure we should point out its potential 
strengths and weaknesses. 

The procedure is a good one for studying choice 
between different schedules of reinforcement because, 
unlike simple concurrent schedules, the measure of 
choice is not confounded with the rates of responding 
generated by the schedules chosen. For example, re-
sponse rates on VR schedules tend to be much higher
than on FI schedules. If we took the relative rates of 
responding on a simple concurrent VR FI schedule as 
our measure of choice, therefore, we would be stack- 
ing the deck in favor of the VR schedule. Such a pro- 
cedure would be more obviously inappropriate if we 
were comparing choice between a ratio schedule and a 
schedule which required low rates of responding. Per- 
haps to avoid such confusion of choice with the re- 
sponse rates generated by the schedules being chosen, 
Autor developed the concurrent chains procedure dia- 
gramed in Figure 6. In this procedure, the organism 
(normally a pigeon) responds on two concurrently 
available keys, each illuminated by the same stimulus. 
Responses on each key occasionally produce another 
stimulus, correlated with entry into the terminal link 
of the chain on that key. Entry into the two terminal 
links generally occurs at the same rate. Once the sub- 
ject enters one terminal link the other key becomes 
dark and inoperative. Responses in the presence of the 
terminal-link stimuli are reinforced with food. In 
most experiments the initial links are reinstated after 
the subject obtains a single reinforcement in a termi- 
nal link. The independent variable has generally in- 
volved some difference in the conditions arranged 
during the two terminal links. The dependent vari- 
able is the measure of choice: the responses in the 
initial links. 

Fig. 6. The concurrent chains procedure. Panel A indicates the
sequence of events when responses on the left key are reinforced;
panel B presents the analogous sequence on the right key.
Responses in the presence of the colored lights (the stimuli of
the terminal links) are reinforced with food according to some
schedule of reinforcement (generally, the independent variable).
The measure of choice is the relative rate of responding in the
presence of the concurrently available white lights. Typically,
equal VI schedules arrange access to the terminal links. (Adapted
from Fantino, 1969b. © 1969 by the Society for the Experimen-
tal Analysis of Behavior, Inc.)

Another advantage of the concurrent chains pro-
cedure is that it keeps the number of primary rein-
forcements for responding on each key close to the
number intended by the experimenter over a wide 
range of preference for pecking one key or the other. 
If a subject responded exclusively on one key, for 
example, all primary reinforcements would be de- 
livered on that key. Because of the nature of concur-
rently available VI schedules, however, the subject 
produces a higher rate of entry into the terminal links 
(and typically a higher rate of primary reinforcement) 
if it responds on both keys. In practice, this assures 
that the terminal link of each key will be entered 
equally often, even while the dependent variable— 
relative response rate in the initial links—is varying 
over a wide range. Thus the effects of number of 
reinforcements are not confounded with those of the 
intended independent variable. The procedure is also 
useful for studying conditioned reinforcement for two 
additional reasons. Normally, the effects of a condi- 
tioned reinforcer are examined in the absence of pri- 
mary reinforcement in order to avoid possible con- 
founding between the two reinforcing effects. In the 
terminal link of concurrent chains, a primary rein- 
forcement schedule can be maintained, and the condi-
tioned reinforcer does not therefore suffer extinction.
Secondly, choice situations are known to be very sensi-
tive to experimental manipulations (e.g., Catania,
1963, 1966; Herrnstein, 1961; Rachlin, 1967), and this
favors experimental differentiations among the
strengths of different conditioned reinforcers.

Fig. 7. Relative rate of responding as a function of relative
rate of primary reinforcement (left panels) and the relative
probability of primary reinforcement (right panels). The diag-
onal line from the origin to (1.0, 1.0) represents the locus of
perfect matching between relative response rates and relative
reinforcements. The other line represents the linear regression
line through the data points. Each graph shows the calculated
linear equation and the standard deviation around the regres-
sion line. (Adapted from Herrnstein, 1964a. © 1964 by the
Society for the Experimental Analysis of Behavior, Inc.)

Most of the work with the concurrent chains procedure
has examined the dependence of choice upon
parameters of the interreinforcement interval (IRI), 
the interval between the conditioned reinforcement 
(i.e., a reinforced choice response) and the uncondi- 
tioned reinforcement (i.e., food). This is also the 
material most relevant to our theoretical treatment of 
conditioned reinforcement and to the integration of 
results from concurrent chains studies with those 
from studies of observing responses: both sets of re- 
sults indicate that the strength of a conditioned rein- 
forcer is determined primarily by the increase in 
proximity to primary reinforcement correlated with 
the conditioned reinforcer (the delay reduction hy- 
pothesis). We shall not, therefore, discuss other vari- 
ables which have been shown to affect preference in 
the concurrent chains procedure such as magnitude of 
reinforcement (e.g., Schwartz, 1969; Ten Eyck, 1970) 
or number of reinforcements (Fantino & Herrnstein, 
1968; Squires & Fantino, 1971). 


Choice with Aperiodic Schedules in Terminal Links 


Although all the research we shall summarize sup- 
ports the notion that choice is determined by the 
extent of reduction in delay to primary reinforce- 
ment correlated with the alternatives, the early work 
was consistent with a somewhat simpler interpretation 
which was accepted for nearly a decade. Specifically, 
both Autor (1960, 1969), using VI, VR, and response- 
independent schedules in the terminal links, and 
Herrnstein (1964a), using VI and VR schedules, found 
that the relative rates of choice responding (the num- 
ber of initial-link responses on one key divided by the 
total number of initial-link responses on the two keys) 
matched the relative rates of reinforcement (the rate 
of reinforcement on one key divided by the sum of the 
two rates of reinforcement) in the two terminal links. 
This relation may be summarized by the following 
equation: 


$$\frac{R_L}{R_L + R_R} = \frac{1/t_{2L}}{1/t_{2L} + 1/t_{2R}} \tag{1}$$

in which $R_L$ and $R_R$ represent the number of responses
during the initial links on the left and right
keys, respectively, and $t_{2L}$ and $t_{2R}$ represent the
average durations of the left and right terminal links.

Fig. 8. Average linear regression lines for relative rate of
responding during choice phase as a function of the relative rate
of reinforcement (solid line crossing the diagonal line) and the
relative probability of reinforcement (dashed line). The diagonal
line from the origin to (1.0, 1.0) represents the locus of matching
between choice and relative reinforcements. The subscripts
of X in the equations (r and p) correspond to the independent
variables of relative rate ($X_r$) and relative probability ($X_p$).
(Adapted from Herrnstein, 1964a. © 1964 by the Society for the
Experimental Analysis of Behavior, Inc.)

Figures 7 and 8 show Herrnstein’s basic results: the
relation of choice responding to both the relative rates
of reinforcement and the relative reinforcements per
response (in the terminal link) for individual birds
(Figure 7) and the averaged data (Figure 8). As Figure
8 shows most clearly, choice responding matched relative
frequency of reinforcement more closely than
relative probability of reinforcement. In this sense,
Herrnstein’s study went beyond that of Autor in showing
that the relative rate of primary reinforcement,
rather than its relative probability, controlled the
effectiveness of the conditioned reinforcers correlated
with the terminal links of concurrent chains.

Fantino (1969a, 1969b) suggested an alternative
(now called the delay reduction hypothesis) to Equation 1
which was also consistent with these data.
Specifically:

$$\frac{R_L}{R_L + R_R} =
\begin{cases}
\dfrac{T - t_{2L}}{(T - t_{2L}) + (T - t_{2R})} & (\text{when } t_{2L} < T,\ t_{2R} < T) \\
1 & (\text{when } t_{2L} < T,\ t_{2R} > T) \\
0 & (\text{when } t_{2L} > T,\ t_{2R} < T)
\end{cases} \tag{2}$$

where T represents the average delay to primary reinforcement
from the onset of either initial link and
$t_{2L}$ and $t_{2R}$ again represent the average durations of
the left and right terminal links, respectively. Note
that when entry into either terminal link produces an
increase in the average delay to primary reinforcement
(either $t_{2L} > T$ or $t_{2R} > T$), Equation 2 requires
the organism to emit all its choice responses to the
other key. In other words, Equation 2 specifies when
the organism should respond exclusively on one key.
Of course, the case in which both $t_{2L}$ and $t_{2R}$ are
greater than T is impossible. 
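
As a rough numerical illustration of Equations 1 and 2, the following Python sketch computes both predictions for a pair of chains. The function names are ours, and the way T is computed here (mean time to enter either terminal link under two independent VI initial links, plus the entry-weighted mean terminal-link duration) is an assumption that the passage above does not spell out.

```python
def eq1(t2L, t2R):
    """Equation 1: matching to relative terminal-link reinforcement rates."""
    return (1 / t2L) / (1 / t2L + 1 / t2R)


def mean_delay_T(t1L, t1R, t2L, t2R):
    """Average delay to primary reinforcement from initial-link onset.

    Assumption (not stated in the text): the two VI initial links run
    concurrently and independently, so terminal links are entered at a
    combined rate of 1/t1L + 1/t1R, each in proportion to its own rate.
    """
    entry_rate = 1 / t1L + 1 / t1R
    p_left = (1 / t1L) / entry_rate
    return 1 / entry_rate + p_left * t2L + (1 - p_left) * t2R


def eq2(t1L, t1R, t2L, t2R):
    """Equation 2: relative delay reduction, with exclusive preference
    when one terminal link is longer than T."""
    T = mean_delay_T(t1L, t1R, t2L, t2R)
    if t2L > T:
        return 0.0
    if t2R > T:
        return 1.0
    return (T - t2L) / ((T - t2L) + (T - t2R))


# Terminal links of VI 30-sec (left) and VI 90-sec (right).
print(round(eq1(30, 90), 2))             # matching prediction
print(round(eq2(120, 120, 30, 90), 2))   # equal VI 120-sec initial links
print(eq2(40, 40, 30, 90))               # short initial links: T < 90
```

Under these assumptions the two equations agree when the initial links are VI 120-sec, and Equation 2 calls for exclusive preference once the initial links are short enough that T falls below the longer terminal link.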

Equation 2 implies that the relative rate of re- 
sponding in the initial links matches the reduction in 
the average delay to primary reinforcement correlated 
with entry into one terminal link relative to the 
reduction correlated with entry into the other. In 
other words, the greater the improvement, in terms of 
temporal proximity to reinforcement, correlated with 
the onset of the stimulus, the more effective it will be 
as a conditioned reinforcer. Like Equation 1, Equation 2
assumes that the conditioned reinforcing effectiveness
of a stimulus will be a function of its proximity
to primary reinforcement ($t_{2L}$ and $t_{2R}$ in
Equations 1 and 2). Unlike Equation 1, Equation 2
requires consideration of the temporal context in
which this proximity is embedded. For example, an
implication of Equation 1 is that the organism’s preference
for one conditioned reinforcer over another
should be invariant regardless of the rate of entry into
the terminal links, at least when the initial links are
equal. In particular, if $t_{2L}$ and $t_{2R}$ were equal to 30
and 90 sec, respectively, Equation 1 requires a choice
proportion of .75 for $t_{2L}$, regardless of the value of the
initial-link schedules. But one may question why the
organism should ever choose to enter the longer terminal
link ($t_{2R}$) if it can obtain almost immediate access
to the shorter one. The more inaccessible the two
terminal-link schedules—i.e., the longer the initial 
links—the more the organism should be indifferent to 
their difference, since with very long initial links, en- 
trance into either terminal link should be highly rein- 
forcing (just, as we saw earlier, observing is more 
frequent the less likely a positive outcome). Specifi- 
cally, Fantino (1969a) speculated that with sufficiently 
long initial links, organisms should be indifferent between
the different terminal links and that with sufficiently
short initial links the organism should respond
exclusively to the side leading to the shorter terminal 
link. If this is true, then only for a particular band of 
intermediate values should the distribution of choice 
responses match the distribution of reinforcements 
obtained in the terminal links (as required by Equation 1).
Equation 2 also predicts the circumstances
under which the organism should respond exclusively
for the shorter terminal link: When entry into one
terminal link fails to bring the organism closer in
time to primary reinforcement, that terminal link will
not be entered. In terms of conditioned reinforcement, 
a stimulus that is not correlated with a reduction in 
average delay to reinforcement should not be a con- 
ditioned reinforcer regardless of how contiguous the 
onset of the stimulus may be to primary reinforcement 
(just as observing is not well maintained by stimuli 
correlated with positive reinforcement but with a 
decrease in proximity to reinforcement—e.g., Auge, 
1974). Thus when either value of $t_2$ is greater than T,
entry into the longer terminal link actually moves the
organism further from primary reinforcement, and the
stimulus correlated with entry into that terminal link
should not be a conditioned reinforcer.

Since Autor’s and Herrnstein’s data were consistent
with both Equations 1 and 2, Fantino (1969b) tested
the two equations by varying T while holding $t_{2L}$ and
$t_{2R}$ constant. Specifically, Fantino used three different
pairs of identical VI schedules to arrange entry into
the two terminal links, which were always VI 30-sec
and VI 90-sec. When the pair of initial-link schedules
of intermediate duration was in effect (VI 120-sec), the
choice proportions in the initial links matched the
relative rates of reinforcement in the terminal links
(Equation 1), a result consistent with both formulations.
With shorter or longer initial-link durations,
however, the distribution of choice responses no longer
matched the relative rates of reinforcement, but continued
to be well described by Equation 2. In addition,
Fantino utilized two schedules in which both the
initial and terminal links were unequal: chain VI 90-sec
VI 30-sec vs. chain VI 30-sec VI 90-sec. Choice data
from this procedure were also consistent with Equa- 
tion 2. 
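
The predictions that this experiment pits against each other can be sketched numerically. The Python fragment below is only an illustration: the helper name is ours, the four conditions are those listed in Table 1, and T is computed under the assumption (not stated in the text) that the two VI initial links operate independently, so that each terminal link is entered in proportion to its own entry rate.

```python
def predicted_choice(t1L, t1R, t2L, t2R):
    """Return (Equation 1, Equation 2) predictions for the left key.

    t1L, t1R: mean initial-link VI values; t2L, t2R: mean terminal-link
    durations (all in seconds).
    """
    eq1 = (1 / t2L) / (1 / t2L + 1 / t2R)

    # Assumed computation of T: mean time to enter a terminal link plus
    # the entry-weighted mean terminal-link duration.
    entry_rate = 1 / t1L + 1 / t1R
    p_left = (1 / t1L) / entry_rate
    T = 1 / entry_rate + p_left * t2L + (1 - p_left) * t2R

    if t2L > T:
        eq2 = 0.0
    elif t2R > T:
        eq2 = 1.0
    else:
        eq2 = (T - t2L) / ((T - t2L) + (T - t2R))
    return eq1, eq2


# The left key is taken to be the one leading to the VI 30-sec terminal link.
conditions = {
    "(i)   chain VI 90 VI 30 vs. chain VI 30 VI 90": (90, 30, 30, 90),
    "(ii)  chain VI 600 VI 30 vs. chain VI 600 VI 90": (600, 600, 30, 90),
    "(iii) chain VI 120 VI 30 vs. chain VI 120 VI 90": (120, 120, 30, 90),
    "(iv)  chain VI 40 VI 30 vs. chain VI 40 VI 90": (40, 40, 30, 90),
}

for label, schedule in conditions.items():
    eq1, eq2 = predicted_choice(*schedule)
    print(f"{label}: Eq. 1 = {eq1:.2f}, Eq. 2 = {eq2:.2f}")
```

With this reading of T, the sketch returns the proportions listed in the "Equation 1" and "Equation 2" rows of Table 1, which suggests that the assumed computation is at least consistent with the one used there.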

The results from this experiment are shown in 
Table 1. These data show that for 15 out of 16 cases
in which Equations 1 and 2 describe different choice
proportions, Equation 2 provides a closer fit to the
obtained data. Moreover, for each of the 16 points,
Equation 2 accounts for the direction of the deviations
from Equation 1. Thus for each of the 10 cases
in which Equation 2 requires a higher choice proportion
than Equation 1, columns (i) and (iv) in section
B of Table 1 show that Equation 1 underestimates
these proportions. Similarly, for each of the six cases
in which Equation 2 requires a lower choice proportion
than Equation 1, column (ii) in section B of
Table 1 shows that Equation 1 overestimates these
proportions. The data in column (iv) in section A of
Table 1 provide partial support for the prediction
that the organism will cease responding for entrance
into the longer terminal link when such entry moves
him further in time from primary reinforcement. This
prediction is supported in that there was a very strong 
preference for the shorter terminal link and, for two 
of the six birds, exclusive preference was reached. 
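
The tallies cited in this paragraph can be rechecked directly from the entries of Table 1. The sketch below is our own bookkeeping, not part of the original analysis; proportions are entered in hundredths to avoid rounding problems, and one entry is an inference that is flagged in the comments.

```python
# Section A of Table 1, in hundredths; None marks a missing entry.
# Pigeon 1, condition (iv), is taken as 93, the value implied by the
# deviations listed for that bird in section B of the table.
observed = {
    "i":   [97, 100, 83, 98, None, None],
    "ii":  [66, 56, 63, 63, 57, 53],
    "iii": [74, 77, 87, 70, 97, 82],
    "iv":  [93, 100, 100, 89, 96, 92],
}
eq1 = {"i": 75, "ii": 75, "iii": 75, "iv": 75}
eq2 = {"i": 90, "ii": 55, "iii": 75, "iv": 100}

cases = closer_to_eq2 = right_direction = 0
for cond, values in observed.items():
    if eq1[cond] == eq2[cond]:
        continue  # the two equations agree in condition (iii)
    for v in values:
        if v is None:
            continue
        cases += 1
        # Does Equation 2 come closer to the obtained proportion?
        if abs(v - eq2[cond]) < abs(v - eq1[cond]):
            closer_to_eq2 += 1
        # Does the datum deviate from Equation 1 toward Equation 2?
        if (v - eq1[cond]) * (eq2[cond] - eq1[cond]) > 0:
            right_direction += 1

print(cases, closer_to_eq2, right_direction)
```

The single case in which Equation 1 comes closer is Pigeon 1 in condition (ii), where the obtained proportion lies slightly nearer .75 than .55.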
There is a prediction of Equation 2 that seemed
doubtful. Whenever the terminal-link durations $t_{2L}$
and $t_{2R}$ are equal, a choice proportion of .50 (i.e.,
indifference) is required no matter what the initial-link
values are, since the two terminal-link stimuli are
correlated with the same degree of reduction in average
delay to reinforcement. Instead, one might expect
that preference would vary with the relative values of
the initial links, since the rates of both primary reinforcement
[...] for the two keys when the values of the initial-link VI
schedules ($t_{1L}$ and $t_{1R}$ on the left and right, respectively)
are unequal. Therefore, a critical test of the
generality of Equation 2 assesses whether indifference
holds when the initial links are unequal for the two
keys with $t_{2L}$ equal to $t_{2R}$.


Table 1*

A. Proportion of choice responses to key providing higher rate of reinforcement in
terminal link for each pigeon in each of four conditions. The average proportion for
each condition and the proportions required by Equations 1 and 2 are listed below
the line. All VI values listed are in seconds.

                     (i)             (ii)             (iii)            (iv)
                  Chain           Chain            Chain            Chain
                  VI 90 VI 30     VI 600 VI 30     VI 120 VI 30     VI 40 VI 30
                  vs. chain       vs. chain        vs. chain        vs. chain
                  VI 30 VI 90     VI 600 VI 90     VI 120 VI 90     VI 40 VI 90
Pigeon 1            .97             .66              .74              .93
Pigeon 2           1.00             .56              .77             1.00
Pigeon 3            .83             .63              .87             1.00
Pigeon 4            .98             .63              .70              .89
Pigeon 5             —              .57              .97              .96
Pigeon 6             —              .53              .82              .92
--------------------------------------------------------------------------------
Avg. proportion     .94             .60              .81              .95
Equation 1          .75             .75              .75              .75
Equation 2          .90             .55              .75             1.00


B. The deviations of the choice proportions above from the proportions required by
Equations 1 and 2 for each pigeon for conditions (i)-(iv). For each of the sixteen
points in which Equations 1 and 2 make different predictions, the smaller deviation is
underscored. Column (v) on each side gives the mean of the absolute deviations for
each pigeon. The means of the absolute deviations for each condition are listed below
the line.

                         Equation 1                          Equation 2
              (i)   (ii)  (iii)  (iv)   (v)       (i)   (ii)  (iii)  (iv)   (v)
Pigeon 1     +.22  −.09   −.01   +.18   .12      +.07  +.11   −.01   −.07   .06
Pigeon 2     +.25  −.19   +.02   +.25   .18      +.10  +.01   +.02    0     .03
Pigeon 3     +.08  −.12   +.12   +.25   .14      −.07  +.08   +.12    0     .07
Pigeon 4     +.23  −.12   −.05   +.14   .14      +.08  +.08   −.05   −.11   .08
Pigeon 5       —   −.18   +.22   +.21   .20        —   +.02   +.22   −.04   .09
Pigeon 6       —   −.22   +.07   +.17   .15        —   −.02   +.07   −.08   .06
--------------------------------------------------------------------------------
Mean of
absolute
deviations    .20   .15    .08    .20   .16       .08   .05    .08    .05   .06

* (From Fantino, 1969b. © 1969 by the Society for the Experimental Analysis of Behavior, Inc.)


Squires and Fantino (1971) supplied this test and
suggested that an additional variable, which takes
into account the rate of primary reinforcement on
each key separately, should be incorporated into
Equation 2:

$$\frac{R_L}{R_L + R_R} =
\begin{cases}
\dfrac{r_L (T - t_{2L})}{r_L (T - t_{2L}) + r_R (T - t_{2R})} & (\text{when } t_{2L} < T,\ t_{2R} < T) \\
1 & (\text{when } t_{2L} < T,\ t_{2R} > T) \\
0 & (\text{when } t_{2L} > T,\ t_{2R} < T)
\end{cases} \tag{3}$$


where $r_L = n_L/(t_{1L} + n_L t_{2L})$, which is the rate of primary
reinforcement on the left key ($n_L$ is the number
of primary reinforcements obtained during one entry
into the terminal link of the chain on the left key); and
$r_R = n_R/(t_{1R} + n_R t_{2R})$, the rate of primary reinforcement
on the right key. Equation 3 has the additional
advantage of predicting matching in choice behavior
with concurrent VI schedules (Herrnstein, 1970), i.e.,
when $t_{2L} = t_{2R}$.

Figure 9 illustrates the predictions made by Equations
1 through 3 in two concurrent chains situations.
Figure 9a illustrates the case, studied by Autor (1960),
Herrnstein (1964a), and Fantino (1969b), in which the
terminal-link schedules are unequal (VI 30-sec and VI
90-sec in the example). Figure 9a shows the predic- 
tions made by each of the three equations as the value 
of the (equal) initial links increases. Figure 9b illus- 
trates the type of schedule studied by Squires and 
Fantino (1971) in which the two terminal links are 
equal but the initial links are unequal. While Equa- 
tions 1 and 2 require choice proportions of .50 regard- 
less of the difference between the two initial links, 
Equation 3 requires an increase in choice proportions 
on the key with the constant initial-link VI value as 
the size of the initial-link VI schedule increases on the 
other key.

Squires and Fantino’s results unequivocally sup- 
ported Equation 3. Nonetheless, for the sake of clar- 
ity, we shall discuss the simpler, unmodified version 
in this chapter. This is acceptable since Equations 2 
and 3 make similar predictions with equal initial 
links, and since virtually all studies of concurrent 
chains schedules have used equal initial links. 
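
To make the contrast between Equations 2 and 3 concrete, the sketch below evaluates both for schedules of the two kinds shown in Figure 9. It is an illustration under stated assumptions: the function names are ours, one primary reinforcement per terminal-link entry is assumed by default, T is computed as the mean time to enter a terminal link plus the entry-weighted mean terminal-link duration, and the 30-sec terminal links used for the Figure 9b case are a hypothetical value chosen only for the example.

```python
def mean_delay_T(t1L, t1R, t2L, t2R):
    """Assumed computation of T for independent concurrent VI initial links."""
    entry_rate = 1 / t1L + 1 / t1R
    p_left = (1 / t1L) / entry_rate
    return 1 / entry_rate + p_left * t2L + (1 - p_left) * t2R


def eq2(t1L, t1R, t2L, t2R):
    T = mean_delay_T(t1L, t1R, t2L, t2R)
    if t2L > T:
        return 0.0
    if t2R > T:
        return 1.0
    return (T - t2L) / ((T - t2L) + (T - t2R))


def eq3(t1L, t1R, t2L, t2R, nL=1, nR=1):
    """Equation 3: delay reductions weighted by each key's overall rate
    of primary reinforcement, r = n / (t1 + n * t2)."""
    T = mean_delay_T(t1L, t1R, t2L, t2R)
    if t2L > T:
        return 0.0
    if t2R > T:
        return 1.0
    rL = nL / (t1L + nL * t2L)
    rR = nR / (t1R + nR * t2R)
    left = rL * (T - t2L)
    return left / (left + rR * (T - t2R))


# Figure 9a type: unequal terminal links (VI 30 vs. VI 90), equal initial links.
for x in (60, 120, 600):
    print("9a", x, round(eq2(x, x, 30, 90), 3), round(eq3(x, x, 30, 90), 3))

# Figure 9b type: equal terminal links (hypothetical 30 sec), VI 60 initial
# link on the left key and VI X on the right key.
for x in (60, 120, 240):
    print("9b", x, round(eq2(60, x, 30, 30), 3), round(eq3(60, x, 30, 30), 3))
```

In the second loop Equation 2 stays at .50 whatever the value of X, whereas Equation 3 moves toward the key with the constant VI 60-sec initial link as X grows, which is the qualitative pattern described for Figure 9b. With equal terminal links the delay-reduction terms cancel and Equation 3 reduces to matching to the overall rates $r_L$ and $r_R$.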

Thus data from these procedures suggest that the 
strength of a conditioned reinforcer is a function of 
the reduction in average delay to reinforcement cor- 
related with one conditioned reinforcer, relative to 
the reduction in average delay to reinforcement correlated
with the other. The delay reduction formulation
is also useful in making ordinal predictions for
binary choice, i.e., whether one schedule will be preferred
to another, when the terminal links consist of
schedules other than VIs. This formulation must be 
restricted largely to VI schedules, however, where 
precise quantitative predictions are required. These 
restrictions (which also apply to the reinforcement 
density formulation represented by Equation 1) de- 
pend, in part, on two characteristics of choice. First, 
short interreinforcement intervals are weighted more 
heavily than long ones in choice, a fact consistent 
with the relative contribution to observing behavior 
made by the more positive stimulus as opposed to 
either the less positive or negative stimulus. Thus 
Herrnstein (1964b), using the concurrent chains pro- 
cedure, found that pigeons strongly preferred a VI 
schedule with an average interreinforcement interval 
of 15 sec to an FI 15-sec schedule. Comparable results 
were reported by Bower et al. (1966), Davison (1969, 
1972), Fantino (1967), Hursh and Fantino (1973), 
Killeen (1968a), and Navarick and Fantino (1975). 
Since variable schedules are preferred to fixed sched- 
ules providing the same reinforcement density (and 
the same mean delay reduction), an appropriate prin- 
ciple of transforming variable schedules into their 
fixed equivalents, and vice versa, would be necessary 
if a single quantitative model were to describe choice 
proportions for both types of schedules. Choice be- 
tween two schedules could then be described by first 
converting the schedules into their VI equivalents and 
then applying Equation 3. But no adequate general 
principle of transformation has been found. In the 
second place, when different types of schedules are 
arranged in the terminal links of concurrent chains, 
choice is likely to fail tests of strong stochastic transi- 
tivity (Navarick & Fantino, 1972, 1974, 1975). One 
implication of such intransitivities is that a general 
principle of transformation cannot be found. The 
issues here are thorny, and the interested reader can 
find a complete account in Fantino and Navarick 
(1974).

Fig. 9. The predictions of Equations 1, 2, and 3. In (a) the
choice proportions (for the VI 30-sec terminal-link schedules)
required by the three equations are plotted against the value
of the equal initial links (VI X-sec). In (b) the choice proportions
for the key with the VI 60-sec schedule in the initial link
required by each of the three equations are plotted against the
initial-link value on the other key (VI X-sec). (Adapted from
Squires & Fantino, 1971. © 1971 by the Society for the Experimental
Analysis of Behavior, Inc.)

For our present purpose it is sufficient to note
that the delay reduction hypothesis is consistently
superior to the reinforcement density hypothesis in 
accurately describing choice. 
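
For readers unfamiliar with the transitivity test mentioned above, the following sketch shows what a check of strong stochastic transitivity involves. It uses the usual definition (if a is chosen over b and b over c at least half the time, then the proportion choosing a over c should be at least as large as the larger of the other two). The schedule labels and choice proportions are hypothetical and are not data from the studies cited.

```python
from itertools import permutations


def sst_violations(p):
    """List ordered triples (a, b, c) that violate strong stochastic
    transitivity. p[(a, b)] is the proportion of choices of a over b."""

    def pref(a, b):
        return p[(a, b)] if (a, b) in p else 1 - p[(b, a)]

    items = sorted({x for pair in p for x in pair})
    violations = []
    for a, b, c in permutations(items, 3):
        ab, bc, ac = pref(a, b), pref(b, c), pref(a, c)
        if ab >= 0.5 and bc >= 0.5 and ac < max(ab, bc):
            violations.append((a, b, c))
    return violations


# Hypothetical pairwise choice proportions among three schedules.
intransitive = {("A", "B"): 0.80, ("B", "C"): 0.70, ("A", "C"): 0.60}
transitive = {("A", "B"): 0.80, ("B", "C"): 0.70, ("A", "C"): 0.90}

print(sst_violations(intransitive))
print(sst_violations(transitive))
```

In the first set A is still preferred to C, so the ordering of the schedules is preserved, but the preference is weaker than strong transitivity requires. Failures of this kind are what make a single scale value per schedule, and hence a general transformation rule, hard to defend.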

Up to now we have considered only the temporal 
duration of the terminal links (i.e., $t_{2L}$ and $t_{2R}$, or
what we have called the interreinforcement interval
or IRI). Events during the IRI also affect choice and
would probably affect observing in the same way, as
we shall point out, though in most cases this has not 
been assessed. (In general, I predict that if one stim- 
ulus is preferred to another, it should maintain ob- 
serving responses which change a mixed schedule, 
composed of both stimuli, to the corresponding multi- 
ple schedule.) We consider these effects in the next 
section. 


Events During the Terminal Links 


Neuringer (1969), in an interesting concurrent 
chains experiment, compared choice between rein- 
forcement on an FI schedule and reinforcement after 
an interval in which the key was dark and inopera- 
tive. He found indifference between the FI and the 
response-independent schedules when the duration of 
the terminal links was equal. From his own and re- 
lated data (including Anger, 1956; Autor, 1960; Dews, 
1962, 1965; Herrnstein, 1964a; Killeen, 1968b; Neu- 
ringer & Schneider, 1968), Neuringer concluded that 
“these studies, together with the present findings, sug- 
gest the following hypothesis: the probability (or 
rate, or latency) of a response is controlled by the 
interval between that response and reinforcement (a) 
independently of the number of other responses inter- 
vening in the interval, and (b) independently of 
whether such intervening responses are required or 
prohibited” (p. 382). 

Others, including Rachlin and Herrnstein (1969) 
and Schneider (1972) have also suggested that or- 
ganisms should be indifferent between two terminal 
links whose overall durations are equal. In other 
words, given equal initial links, choice proportions 
should approximate .50 as long as the durations of 
the terminal links are also equal. This result would 
be consistent with all the models of choice we have 
considered thus far. For example, in Neuringer’s ex- 
periment the stimuli associated with each terminal 
link are correlated with identical reductions in time 
to reinforcement. It might seem surprising, however, 
if choice were so simply and directly dependent on the 
size of the terminal links ($t_{2L}$ and $t_{2R}$ in Equations
1-3) irrespective of events occurring during them. Indeed,
there are important exceptions to this generalization.
These are considered in this section.


ADDITION OF STIMULI PAIRED 
WITH REINFORCEMENT 


Schuster (1969) reported an important study testing 
the notion that stimuli paired with primary reinforcement
will serve as conditioned reinforcers. Brief-stimulus
presentations paired with primary reinforcement
at the end of each terminal link were made
available (on an FR 11 schedule) during one of two 
terminal links. The brief-stimulus presentations were 
therefore superimposed on the terminal-link food 
schedule. The main dependent variable was the rate 
of choice responding in the concurrently available 
initial links. Rate of primary reinforcement in the two 
terminal links was identical and was available on VI 
30-sec schedules. Therefore, if the stimuli paired with 
food reinforcement were conditioned reinforcers, the 
pigeons should prefer to enter the terminal link with 
the superimposed schedule of brief stimuli. Schuster
found that while the superimposed brief stimuli increased
response rates in the terminal link, this terminal
link “was eventually chosen less often. This result
was consistent with a functional analysis, since, on the
key with the superimposed schedules, the brief stimuli
were correlated more often with nonreinforcement”
(p. 234).

As Fantino (1969a), Gollub (1970), and Squires 
(1972) have observed, Schuster’s conclusion is rendered 
ambiguous by the potential effects of the brief-stimulus 
presentations on terminal-link response rates. In par- 
ticular, the high response rates generated by super- 
imposed brief stimuli might have created an aversion 
for that terminal link which might have masked any 
conditioned reinforcing effect. Squires (1972) repeated 
Schuster’s experiment in a way that avoids the con- 
founding factor present in the earlier experiment: 
whereas Schuster had scheduled the superimposed 
brief stimuli on an FR schedule, Squires scheduled 
them on a VI. Squires found no consistent preference
or aversion for the schedule with superimposed brief- 
stimulus presentations. Thus when the high rates of 
terminal-link responding, confounded in Schuster’s 
work with brief-stimulus presentations, are eliminated, 
a preference for the schedule providing the brief stimuli
still fails to develop.

It must be noted, however, that the superimposed 
brief stimuli are conditioned reinforcers only in the 
sense of the pairing hypothesis introduced and dis- 
missed earlier. They certainly are not reinforcers in 
the informational sense (as Schuster rightly pointed 
out), nor are they correlated with a reduction in aver- 
age time to reinforcement. Nor are they likely to 
maintain observing, though this has yet to be as- 
sessed. 

In any case, although Squires’s results are consis- 
tent with Neuringer’s conclusion that choice is un- 
affected by events occurring during the IRI, results 
from three other types of experiments are not.


THE EFFECTS OF REQUIRED
RATES OF RESPONDING 


Fantino (1968) demonstrated that high response 
rate requirements (such as schedules differentially 
reinforcing high rates, or DRH) are apparently aver- 
sive, at least when compared with a simple FI that 
provides reinforcement at the same rate and on the 
same proportion of trials. Other studies, however, 
have failed to demonstrate the effects of response rates 
upon choice (for example, Herrnstein, 1964a; Killeen, 
1968b, 1971; Neuringer, 1969, discussed above). Moore
and Fantino (1975) shed some light on these ostensibly
conflicting findings. They noted that studies on the
response rate problem fell into three categories: (1)
those using aperiodic schedules in which response
rates were varied but particular response rates were
not required (Herrnstein, 1964a; Killeen, 1968b; for
example, Killeen’s pigeons were indifferent between a
VI schedule and a comparable response-independent
schedule, although large differences in response rates
were maintained by the two schedules); (2) those using
periodic schedules in which response rates were varied
but particular response rates were not required
(Neuringer, 1969; Killeen, 1971); and (3) those in which
particular response rates were required on periodic 
schedules (Fantino, 1968). Moore and Fantino ex- 
amined preference for additional schedules falling 
into each of the three categories and replicated the 
prior results; only when particular response rates 
were required on periodic schedules did subjects 
choose the schedule without the response rate require- 
ment. In addition, choice was unaffected when par- 
ticular response rates were required on aperiodic 
schedules. These results suggested that response re- 
quirements influence choice to the extent that re- 
sponses must be emitted during discriminable periods 
of nonreinforcement, as in the early portion of a 
periodic schedule. Moore and Fantino confirmed this 
notion by showing that pigeons preferred a periodic 
response-independent schedule to a periodic, response- 
dependent schedule that included a requirement to 
respond early in the terminal link even though re- 
sponding produced reinforcement only later. It is 
likely that observing would also be maintained by the 
stimulus of the response-independent schedule, i.e., 
that subjects would respond to change a mixed into a 
multiple schedule if one component required responses 
during discriminable periods of nonreinforcement. 


THE EFFECTS OF STIMULUS SEQUENCES 


Fantino (1969a) suggested that if two schedules of 
equal duration were segmented differently, the organism
would prefer the one composed of fewer discriminable
components. In other words, a simple FI
2X should be preferred to a chain FI X FI X schedule. 
Duncan and Fantino (1972) support this suggestion, 
showing that choice for a schedule is substantially 
reduced by the chaining operation. Their procedure 
is schematized and their main finding, preference for 
simple FI schedules over two-link chains with equiva- 
lent durations, is shown in Figure 10. Note that the 
obtained terminal-link durations were close to the 
scheduled ones (Xs close to .50 in Figure 10b), espe- 
cially with the longer terminal links, for which the 
largest preferences were found. Thus these results
cannot be explained in terms of reinforcement rates.

Fig. 10. (Top) Pictorial representation of the experimental
procedure used by Duncan and Fantino (1972). The left portion
of the figure indicates the sequence of events when responses
on the left key were reinforced; the right portion indicates the
sequence of events when responses on the right key were reinforced.
The terminal links consisted of a simple FI schedule on one
of the keys and a chain FI FI schedule on the other key. (Bottom)
The mean choice proportions for each bird on the FI key
as a function of the size of the intervals in the terminal links.
The x’s indicate the relative rate of reinforcement on the FI
key. (Adapted from Duncan & Fantino, 1972. © 1972 by the
Society for the Experimental Analysis of Behavior, Inc.)

Duncan and Fantino (1972) concluded that preference
for the simple FI could be attributed either to
the additional stimulus or to the additional response 
requirement associated with the chain FI FI, but that 
whether the results were due to stimulus or response 
aspects of the chaining operation (or both), the results 
pointed to the insufficiency of reinforcement rates (or 
delay reductions) in determining choice. In order to 
tease apart the role of stimulus and response aspects 
of the chaining procedure, one portion of an experi- 
ment by Wallace (1973) assessed choice between chain 
FT X FT X and a simple FT 2X schedule. If stimulus 
aspects were responsible for the large preferences ob- 
tained by Duncan and Fantino, then preferences 
should be comparable in this procedure which is 
identical except for the absence of a response requirement
in the terminal links. On the other hand, if
response factors were sufficient to explain the large 
preferences obtained by Duncan and Fantino, then 
the results should reveal indifference between the 
chain and the simple FT schedules. Wallace found a 
clear and consistent preference for the simple FT 
schedule. Nonetheless, preferences were far smaller 
than they had been in the Duncan and Fantino study. 
This result suggests that both stimulus and response 
factors were responsible for the large preferences ob- 
tained by Duncan and Fantino (and replicated in a 
different portion of the Wallace study). It is possible, 
of course, that response factors are effective only inso- 
far as they increase the degree of stimulus control. 
Taken together, these results suggest that choice in 
the concurrent chains procedure may not be accurately 
described by considering simply the relative size of 
the terminal links, if the terminal links are segmented 
and are arranged according to periodic (as opposed to 
aperiodic) schedules of reinforcement. Apparently, 
the initial-link stimulus of a chain FI FI or chain FT 
FT schedule is a more effective signal that the or- 
ganism is temporally distant from primary reinforce- 
ment than is the stimulus associated with an FI or FT 
schedule, even though the durations and, therefore, 
the average delays to primary reinforcement are 
equal. Apparently, the subjective delay (or the “psychological
distance”) to primary reinforcement is
greater in the segmented schedule. It should be acknowledged
that these results not only limit the scope
of the delay reduction hypothesis (as well as that of
other formulations of choice in the concurrent chains
procedure), but are consistent with the pairing hypothesis,
dismissed as inadequate earlier, in the following
sense: contiguity with primary reinforcement
may sometimes enhance the effectiveness of a conditioned
reinforcer. In terms of observing responses,
these results suggest that subjects would respond to
convert mixed FI 2X (chain FI X FI X) or mixed FT
2X (chain FT X FT X) schedules to their multiple-schedule
equivalents. Again, these experiments have
not been done.


6 Schneider (1972) found no difference between chain and
tandem VI schedules. The difference between his finding and
that of Duncan and Fantino (1972) and Wallace (1973) is
probably due to his use of VI schedules. For a discussion, see
Duncan and Fantino (1972).


DIFFERENTIAL DISCRIMINATIVE STIMULI
IN THE TERMINAL LINKS 


Several experimenters have shown that organisms 
prefer multiple schedules of reinforcement to equiva- 
lent mixed schedules (e.g., Bower et al., 1966; Hendry, 
1969b; Hursh & Fantino, 1974). Both Bower et al. and 
Hendry found uniformly large preferences for the 
multiple schedules. But responding in the initial links 
in their experiments was reinforced on FR schedules 
which, unlike concurrent VI schedules, are likely to 
lead to exclusive responding to one of the concurrent 
stimuli; this in turn means that most trials ended 
with the multiple-schedule terminal link. This bias 
may have contributed to the magnitude of the prefer- 
ence for the multiple schedule. Hursh and Fantino 
(1974) obtained reliable but much smaller preferences 
for the multiple schedule with VI 60-sec schedules in 
the choice phase and showed that the degree of prefer- 
ence could be manipulated by varying the size of the 
initial-link schedules (i.e., the shorter the choice 
phase, the larger the preference). 

This preference for multiple over mixed schedules 
can be understood in terms of the well-established 
finding that the presence of short interreinforcement 
intervals has a disproportionate influence on choice 
(Davison, 1969, 1972; Fantino, 1967; Herrnstein, 1964b;
Hursh & Fantino, 1973; Killeen, 1968a; Navarick 
& Fantino, 1975). In mixed schedules of reinforce- 
ment, no stimuli correlated with a short delay occur 
at the beginning of the terminal links, even on that 
half of the trials in which the shorter delay is sched- 
uled. On multiple schedules, on the other hand, stim- 
uli correlated with the short delay are immediately 
available on half of the trials. The occurrence of these
stimuli is responsible for the preference for the multiple
schedule over the mixed schedule. 

The results reviewed here suggest that even when 
different terminal links provide the same distribu- 
tion of interreinforcement intervals, subjects will 
choose the link in which different stimuli are correlated
with differing delays of reinforcement. Such
preference, and its interpretation in terms of the rein- 
forcing strength of stimuli correlated with positive 
outcomes, is consistent with the data and theory on 
observing responses, reviewed earlier. 


SUMMARY 

The results considered in the last three sections all 
suggest that while terminal-link duration (the inter- 
reinforcement interval or IRI) is a crucial determi- 
nant of choice, events during the IRI must also be 
considered in any complete account of choice in the 
concurrent chains procedure. Stated in terms of con- 
ditioned reinforcement on periodic schedules, the 
results from the section on events during the IRI sug- 
gest that the strength of a stimulus as a conditioned 
reinforcer depends not only on the reduction in 
average delay to reinforcement correlated with its 
onset but also on whether response rate is con- 
strained in its presence, on the number of additional 
stimuli intervening between it and primary reinforce- 
ment, and on the presence of differential discrimina- 
tive stimuli in the terminal links (as in the mixed- vs. 
multiple-schedule comparison). Results from each of 
the three areas reviewed have been firmly established 
only with periodic (FR, FI, and FT) schedules, as 
opposed to aperiodic (VR, VI, and VT) schedules, 
suggesting the following: since clearly discriminable 
periods of nonreinforcement are not present in aperi- 
odic schedules, differences between aperiodic schedules 
are likely to be less salient and to have less effect upon 
choice than comparable differences between periodic 
schedules. Thus while the delay reduction hypothesis
of conditioned reinforcement is consistent with all of 
the data from the observing response and concurrent 
chains literature when only the size of the IRI is 
manipulated, some events occurring during the IRI 
must also be considered for a complete account of 
choice. It is likely that these same events would have 
comparable effects on observing responses. 


CONCLUSIONS 


The research on observing behavior suggests that 
information per se is not reinforcing. While drastically
modified versions of the information hypothesis
(stressing “good news” only) can be made to fit most
of the observing response data, such accounts are often 
indistinguishable from conditioned reinforcement hy- 
potheses such as the delay reduction hypothesis and 
the reinforcement density hypothesis. Moreover, the 
information hypotheses all seem to lack any specification
of the importance of temporal variables, which
are critical for the maintenance of operant behavior. 

A more viable possibility is that organisms will re- 
spond to produce stimuli correlated with a reduction 
in delay to primary reinforcement (whether or not the 
observing response actually affects time to reinforce- 
ment). Studies of choice behavior with the concurrent 
chains procedure are also consistent with this delay 
reduction hypothesis: when an organism chooses be- 
tween two stimuli correlated with different reductions 
in delay to primary reinforcement, its choice of either 
stimulus is a monotonic function of the relative re- 
duction in average time to reinforcement correlated 
with that stimulus. If a stimulus is not correlated with 
a reduction in delay to primary reinforcement, it will 
not maintain a significant amount of responding in 
either the observing response or concurrent chains 
procedures—i.e., the stimulus will not be a condi- 
tioned reinforcer. 
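
The delay reduction account can be made concrete with a small calculation. The sketch below assumes one common way of expressing it: the reduction for each alternative is T minus the delay signaled by its terminal-link stimulus, where T is the average time to primary reinforcement from the start of the choice phase, and predicted choice is that reduction taken as a proportion of the summed reductions. The function name, the parameter names, and the handling of non-positive reductions are illustrative assumptions, not details given in the text.

```python
def relative_delay_reduction(total_delay, delay_left, delay_right):
    """Predicted proportion of choices for the left stimulus (illustrative).

    total_delay: average time to primary reinforcement from the onset of
                 the choice phase (T), in seconds.
    delay_left, delay_right: average delay to primary reinforcement
                 signaled by each terminal-link stimulus, in seconds.
    """
    reduction_left = total_delay - delay_left
    reduction_right = total_delay - delay_right

    # A stimulus that signals no reduction in delay is assumed here not to
    # function as a conditioned reinforcer, so it attracts no choices.
    if reduction_left <= 0 and reduction_right <= 0:
        raise ValueError("neither stimulus signals a reduction in delay")
    if reduction_left <= 0:
        return 0.0
    if reduction_right <= 0:
        return 1.0

    return reduction_left / (reduction_left + reduction_right)


if __name__ == "__main__":
    # Hypothetical values: T = 60 s, terminal-link delays of 10 s and 30 s.
    print(relative_delay_reduction(60.0, 10.0, 30.0))
```

With equal terminal-link delays the function returns .5 (indifference), and the predicted preference grows monotonically as one stimulus signals a larger reduction than the other.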


Conditioned Suppression and the Effects
of Classical Conditioning on Operant Behavior


INTRODUCTION 


A defining feature of studies of classical condition- 
ing is that the delivery of stimuli and the relationship 
between them are controlled by the experimenter in- 
dependently of a subject’s behavior (Black & Prokasy, 
1972, p. xi). Pavlov (1927) was, of course, the first
systematically to investigate the effects of such procedures
on behavior. He presented one event (the
conditioned stimulus: CS) regularly in a fixed tem- 
poral relationship with a second event (the uncondi- 
tioned stimulus: US) which reliably elicited a re- 
sponse (the unconditioned response: UR). He found 
that eventually the CS came to produce behavior (the 
conditioned response: CR) which was similar to the 
UR. Pavlov’s measures of this conditioning of an
acquired reflex were simple but adequate. For exam- 
ple, when the UR was salivation caused by an irri- 
tant placed on the tongue, the CR was measured in 
terms of the number of drops of saliva resulting from 
the presentations of the CS. In this way conditioned 
responses were measured as they developed from a 
zero value on the first presentations of the CS to an 
asymptotic value when the stimulus was repeatedly 
presented with the US. 

This chapter considers the effects of procedures of
this nature in which events which signal the delivery 
of a US such as food or a shock are presented inde- 
pendently of an animal’s behavior. However, the dis- 
cussion is confined to experiments in which the effects 
are measured by the changes they produce in behavior 
which is maintained by response-dependent reinforcement.
The dependent variable in such studies is provided
by a comparison of the frequency with which
operant responses are emitted during the Pavlovian 
CS and in its absence. A typical experimental situation
is shown in diagrammatic form in Figure 1. The
top line shows the presentation of a continuous stimulus
(e.g., a light) which is associated with a schedule
of reinforcement. The fourth line shows the operant
responses (e.g., lever presses) which are emitted during 
this discriminative stimulus, and the bottom line de- 
picts the delivery of resultant response-dependent rein- 
forcers according to an intermittent schedule, such as 
a variable-interval schedule. During the discriminative 
stimulus, and therefore while the subject is emitting 
operant responses, a second stimulus, such as a tone, is 
presented: this is shown in the second line of Figure 1. 
This second stimulus signals the delivery of a Pavlovian
US such as electric shock (third line). The second
and third lines therefore depict in this case a
typical Pavlovian delayed conditioning procedure,
which is superimposed on a situation in which operant
behavior is maintained by occasional reinforcement.
The only behavioral measure in this experiment
is provided by the operant responses depicted in
Figure 1. This is the case in most of the experiments
discussed in this chapter, although in some studies
additional measures are taken of autonomic activity
(respondent behaviors) which occur during the CS or
after the US, such as changes in heart rate.

[Figure 1: timing diagram with five rows, from top to bottom: discriminative stimulus, Pavlovian CS, Pavlovian US, operant responses, and reinforcers, plotted against time.]

Fig. 1. Diagram representing typical experimental arrangements
in the studies discussed in this chapter.

The experiments to be discussed differ from con- 
ventional classical conditioning experiments, then, in 
two important ways. First, the behavioral measures are 
provided by operant rather than by traditional respondent
behavior. Second, when the Pavlovian CS is
presented, the subject is at that time emitting a pat- 
tern of behavior which is recognized (and controlled) 
by the experimenter. These differences in emphasis 
have not, however, prevented the collection of data 
of considerable importance for classical conditioning. 
Moreover, such experiments have brought a number 
of theoretical issues into sharp focus. 

Clearly, there are many different interactions which 
may be studied within the general procedure discussed 
above. As Rescorla and Solomon (1967) have pointed 
out, the operant behavior may be maintained by a 
schedule of positive or of negative reinforcement, and 
the Pavlovian US may be either appetitive or aversive. 
These various interactions will be reviewed in this 
chapter. However, most of the research conducted in 
this area has studied the effects of an aversive US 
(specifically, shock) with the Pavlovian procedure in- 
troduced when operant behavior is maintained by 
schedules of food or water reinforcement. It is here 
that the most developed theoretical implications are 
to be found, and it is this area of research and discus- 
sion to which we therefore turn first. No attempt will 
be made in this chapter to provide a comprehensive 
review of the truly vast body of literature reporting
experiments in which Pavlovian procedures have been 
superimposed on operant behavior. The discussion is 
deliberately and strenuously selective, in the hope that 
general principles and problems may emerge more 
directly and more clearly. 


THE ESTES-SKINNER PROCEDURE 
AND THE MEASUREMENT OF 
ITS EFFECTS 


In 1941 Estes and Skinner reported the results of a
study in which they exposed rats to an intermittent 
schedule of food reinforcement which would now be 
described as fixed-interval (FI) 4 min. When the rats’ 
lever-pressing behavior had stabilized, a tone was pre- 
sented for a period of 3 min (5 min in later condi- 
tions). As each period of tone ended, an unavoidable 
and inescapable shock was delivered to the rats through 
the grid floor. The delivery of both tone and shock 
was programmed independently of the rats’ behavior, 
and the temporal relation between them makes it 
possible to term the procedure classical conditioning: 
the tone is thus a Pavlovian CS and the shock is a US.
Unfortunately, the intensity of the shock was not reported
in this early paper, but it must have been relatively
mild because Estes and Skinner mentioned that
it produced no noticeable disturbance in operant be- 
havior when it was first delivered. However, as the 
repeated pairings of tone and shock continued, be- 
havior during the tone became disrupted. The rate of 
responding during the tone fell until it was about 
one-third the rate during the same period “in control 
experiments.” This is illustrated by cumulative rec- 
ords taken during the experiment, which show clearly 
the decrease in operant response rate during the period 
of tone in comparison with rates before and after the 
tone. The general finding that food-reinforced operant 
behavior decreases in frequency during a preshock 
stimulus has since been widely replicated in many 
different experimental conditions. The effect is 
sometimes called conditioned suppression (see detailed 
reviews by Davis, 1968, and Lyon, 1968). It is illus- 
trated by the segments of cumulative record shown 
in Figure 2 (from Blackman, 1974) which show the 
operant behavior of a rat exposed to a variable- 
interval (VI) 30-sec schedule of food reinforcement, 
the delivery of which is shown by brief hatchmarks in 
the usual way. In the middle of successive 7-min peri- 
ods, an auditory stimulus was presented for 1 min, 
during the whole of which time the pen on the cumu- 
lative recorder was deflected downward, although it 
could still be stepped across the paper by an operant
response. 

At the end of each of these 1-min periods, a very
brief shock was delivered. The record at the top of
Figure 2 shows that the responding maintained by the
VI schedule was completely eliminated during the
preshock signal, for the record is horizontal during
the 1-min deflection. Immediately after the shock, the
animal resumed the steady rate of operant responding
maintained by the schedule. The lower segment of the
record shows that during another two 1-min periods
of the noise the rat did not stop responding completely,
although the response rate was lower during
the noise than in its absence. Since the VI schedule
remained in operation during these periods, responses
might still occasionally be followed by a reinforcer
(shown by a brief upward hatchmark on the record).
Figure 2 therefore depicts complete conditioned suppression
(top) and partial conditioned suppression
(bottom) of operant behavior during a CS (noise)
terminated by a US (shock).

[Figure 2: cumulative records; scale bars indicate 7 minutes and 200 responses.]

Fig. 2. Cumulative records illustrating the effects on operant
behavior of a stimulus ending with an unavoidable shock (conditioned
suppression). The response pen is offset during the
preshock stimulus; hatchmarks denote reinforcement.

In this area of research, it has been generally accepted
that an appropriate measurement is a comparison
between response rates during a CS and in its
absence rather than the absolute reduction of response
rates during the CS. This measurement is achieved by
means of a suppression ratio or inflection ratio, but 
unfortunately there has been no general agreement 
about the best way to make this calculation. One sim- 
ple method is to present the rate during the CS as a 
direct proportion of a control rate of responding (e.g., 
Stein, Sidman, & Brady, 1958). In this case, complete 
conditioned suppression is depicted by a ratio of 0; if 
there is no effect on response rate attributable to the 
CS, the suppression ratio is 1.0, and if the response 
rate increases during the CS, the ratio is greater than 
1.0. However, another widely used ratio (e.g., Kamin,
1965; Rescorla, 1968) expresses the rate during the
CS as a proportion of the sum of control and CS response
rates. This results in a figure of 0 for complete
suppression, .5 for no suppression, and greater than .5
for acceleration of responding during a CS. The use
of these different calculations, and of others, is a potential
confusion in this area of research (see Lyon,
1968). Moreover, as will be discussed later, the use of
any relative rate measure such as these is not without
its problems (Lea & Morgan, 1972).
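
The two calculations described above can be set side by side in a few lines. The following is a minimal sketch; the function names and the example rates are illustrative assumptions, and only the two formulas themselves come from the text. Because both ratios are built from the same two rates, one can be converted to the other, which may help when comparing studies that report different measures.

```python
def direct_ratio(cs_rate, control_rate):
    """CS rate as a direct proportion of the control rate.

    0 = complete suppression, 1.0 = no effect, above 1.0 = acceleration.
    """
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return cs_rate / control_rate


def sum_ratio(cs_rate, control_rate):
    """CS rate as a proportion of the sum of CS and control rates.

    0 = complete suppression, 0.5 = no effect, above 0.5 = acceleration.
    """
    total = cs_rate + control_rate
    if total <= 0:
        raise ValueError("at least one rate must be positive")
    return cs_rate / total


def direct_to_sum(direct):
    """Convert a direct ratio d to the equivalent sum ratio, d / (1 + d)."""
    return direct / (1.0 + direct)


if __name__ == "__main__":
    control = 40.0  # hypothetical responses per min in the absence of the CS
    for cs in (0.0, 20.0, 40.0, 60.0):
        print(cs, direct_ratio(cs, control), round(sum_ratio(cs, control), 3))
```

Note that the same behavior (for example, a CS rate half the control rate) yields .5 on the first measure and about .33 on the second, so a reported ratio cannot be interpreted without knowing which formula produced it.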


INVESTIGATIONS OF CLASSICAL 
CONDITIONING PARAMETERS 


Many investigators have found that the amount of 
conditioned suppression (however the suppression 
ratio is calculated) is a function of conventional 
parameters in classical conditioning (Davis, 1968). To
mention two simple examples, Annau and Kamin 
(1961) showed that the amount of conditioned sup- 
pression in rats is an increasing monotonic function 
of the intensity of the shock US, and Kamin and 
Schaub (1963) have shown a similar effect of the intensity
of the CS. Rescorla and Solomon (1967) have
suggested that “it might very well turn out that in- 
strumental responding is as sensitive, or perhaps even 
more sensitive, a measure of the effects of Pavlovian 
conditioning procedures than are the traditionally 
measured conditioned visceral or motor reflexes them- 
selves.” Although Rescorla and Solomon considered 
this possibility “somewhat ironic,” it is certainly true 
that the conditioned suppression paradigm has been 
widely and successfully used in order to develop our 
understanding of classical procedures in general. Re- 
viewing a great deal of such work carried out in his 
own laboratory, Kamin (1965) has claimed that “we 
are measuring respondent behavior indirectly, with a 
surprising quantitative sensitivity.” It is not possible 
to review here the large body of literature in this tra- 
dition. However, as one example of current work, 
Rescorla’s investigations of the necessary and sufficient
conditions for a stimulus to become a classical condi- 
tioned stimulus may be cited. 

Rescorla’s important work grew out of his theo- 
retical discussion of the appropriate control pro- 
cedures for Pavlovian conditioning (Rescorla, 1967). 
In this paper, he suggested that conventional pro- 
cedures did not allow for the measurement of appro- 
priate base lines against which to assess accurately the 
strengths of a conditioned response developed by 
Pavlovian procedures. For example, in some experiments
the stimulus which becomes the conditioned
stimulus has first been presented in such a way that it 
is explicitly never paired with the unconditioned 
stimulus. The temporal contingency between the CS 
and the US is then introduced in the conditioning 
phase of the experiment. Rescorla argued that tradi- 
tional control procedures such as this fail to provide 
an unconfounded measure of the effects of the experi- 
mental contingency between the stimuli. He suggested 
that the only way in which this could be achieved was 
by means of what he termed a “truly random” control 
procedure. With this control procedure, the stimulus 
which is to become the CS and the US are first pre- 
sented at the frequencies to be used in the condition- 
ing phase, but entirely independently; it is therefore 
possible for them to be occasionally presented together 
by chance. Hence occasional contiguity between the 
two stimuli may occur, but no reliable contingency 
exists between them at this stage of the experiment. 

Rescorla’s empirical work has subsequently devel- 
oped these ideas. He has shown, for example, that 
mere contiguity between two stimuli is in fact not 
sufficient for Pavlovian conditioning to develop. If a 
stimulus is to become a CS and thus elicit a CR, it 
now seems that it must, in simple terms, provide a 
subject with “information” about the occurrence of 
the US. In his truly random control, the occurrence of 
a CS provides no information about the occurrence of 
a US, for the probability of a US is the same both 
when a CS is presented and when it is not. On the 
other hand, in traditional delayed conditioning ex- 
periments, the occurrence of a CS provides informa- 
tion that a US is about to be presented; moreover, no 
US is presented without a preceding CS. One of 
Rescorla’s elegant experiments (1968) has used the 
conditioned suppression paradigm to investigate the 
effects of these and of various intermediate relation- 
ships between a CS and a US, and its results are 
summarized in Figure 3. The CS was a 2-min presen- 
tation of a tone, and the US was shock. Small groups 
of rats were exposed to various relationships between 
these stimuli.

Fig. 3. Median suppression ratios for groups of rats as a function
of the probability of shock in the presence and absence of
the CS. A ratio of 0.00 denotes complete suppression, and .50
shows that the CS has no effect. (From Mackintosh, 1974, after
Rescorla, 1968.)

The probability of a shock in a period
of tone was specified for different groups as .40, .20,
.10, or 0 (i.e., no shocks were delivered). Within these 
groups, subgroups of animals were exposed to vary- 
ing probabilities of shock in periods when the tone 
was absent. Rescorla then measured the effects of the 
tone after these training conditions by superimposing 
the tone (without shocks) on food-reinforced operant 
behavior and measuring its disruptive effects. The de- 
pendent variable was expressed as a median suppres- 
sion ratio, this being calculated by the formula which 
yields a ratio of 0 for complete conditioned suppres- 
sion and .5 for no disruptive effect. Figure 3 shows the 
effects of the tone on the first day on which it was 
superimposed on the operant behavior of the various 
groups (four presentations for each rat). If the prob- 
ability of a shock was the same both in the presence 
and the absence of a tone (Rescorla’s truly random 
procedure), the tone had no suppressive effect. Thus 
suppression ratios are consistently at a value of ap- 
proximately .5 whether the probability of a shock in 
the presence and in the absence of the tone was .40, 
.20, or .10. On the other hand, if shocks had initially
occurred only during a period of tone (so that the 
probability of shock in the absence of the tone was 
0), the tone caused relatively severe suppression of 
operant behavior. The amount of this suppression in- 
creases with greater probabilities of shock during the 
tone, for when the probability of shock in the tone 
was .10, .20, or .40, suppression ratios were approx- 
imately .20, .10, and 0 respectively. Figure 3 shows, 
then, that the tone suppressed operant behavior to 
the extent that it had been differentially associated 
with the occurrence of a shock, that is, in proportion 
to the difference between the probability of shock 
during the tone and its probability in the absence of 
the tone. The degree of classical conditioning increased
as the probability of the US during the CS
became greater than the probability of the US in the 
absence of the CS, and not simply as the former value 
increased. 
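
The contingency idea described above can be illustrated with a short calculation. This is a sketch only: it treats the difference between the two conditional probabilities as the quantity of interest, as the text does, and the function names are hypothetical. It makes no attempt to predict the suppression ratios themselves.

```python
def contingency(p_us_given_cs, p_us_given_no_cs):
    """Difference between the two conditional shock probabilities."""
    for p in (p_us_given_cs, p_us_given_no_cs):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_us_given_cs - p_us_given_no_cs


def is_truly_random(p_us_given_cs, p_us_given_no_cs, tol=1e-9):
    """True when the CS carries no information about the US."""
    return abs(contingency(p_us_given_cs, p_us_given_no_cs)) <= tol


if __name__ == "__main__":
    # Shock probabilities during the tone used in the study described above.
    for p_cs in (0.40, 0.20, 0.10):
        # Two of the arrangements discussed in the text: shocks confined to
        # the tone, and equal probabilities in and out of the tone.
        for p_no_cs in (0.0, p_cs):
            print(p_cs, p_no_cs, round(contingency(p_cs, p_no_cs), 2),
                  is_truly_random(p_cs, p_no_cs))
```

On this view a tone with shock probability .40 is no more effective than one with probability .10 if shocks are equally likely in its absence; in both cases the difference is zero.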

Rescorla has developed his account of the neces- 
sary and sufficient conditions for classical conditioning 
far beyond the basic idea indicated above (see, for ex- 
ample, Rescorla, 1969; Rescorla & Wagner, 1972), but 
it is not necessary to provide a more complete de- 
scription here. Rescorla’s important work on classical 
conditioning is based closely on his use of the conditioned
suppression procedure. This procedure makes
it possible to measure behavior (in the form of operant
response rates) throughout both non-CS and CS
periods, whatever the probability of the US in either
of them. The demand for a sensitive and reliable
dependent variable to show the effects of the quantitative
differences in Rescorla’s independent variables is
met by the indirect measurement of classical conditioning
through the frequency of operant responses.

Our understanding of classical conditioning pro- 
cedures has therefore been significantly advanced by 
studies of their effects on operant behavior. The ex- 
amples discussed here are representative of a very 
large body of research which has been carried out 
within this general strategy, and they illustrate its 
contemporary impact. The procedure has consistently 
proved to be robust, reliable, and sensitive, and so
any inherent ironies may surely be readily tolerated 
by researchers in the field of classical conditioning. 


INVESTIGATIONS OF OPERANT 
CONDITIONING PARAMETERS 


In the research discussed in the previous section, 
the emphasis was placed on the way in which the 
processes of classical conditioning may be investigated 
by means of conditioned suppression procedures. The
operant behavior which provides the only dependent 
variable in these studies is usually maintained by a 
simple schedule of intermittent reinforcement. Typically,
a variable-interval schedule is used for this
purpose, for, of course, such schedules maintain op- 
erant behavior over considerable periods of time, and 
the generally moderate and consistent rates of re- 
sponding which they generate make it easy for the 
experimenter to measure any suppressive effect of a 
CS (as in Figure 2, for example). Also, of course, par- 
tial suppression of behavior maintained by a variable- 
interval schedule may have only minimal effects on 
the frequency of reinforcement obtained. If the operant
behavior of the subjects in various experimental
groups is controlled by an identical procedure, we
have seen that it is indeed possible to further the 
analysis of the necessary and sufficient conditions for 
classical conditioning to occur and to measure its 
strength as a function of specified independent vari- 
ables. However, the amount of conditioned suppres- 
sion during a preshock stimulus is not determined 
solely by such Pavlovian variables as the parameters 
of the conditioned and unconditioned stimuli and the 
contingencies between them. In this section other 
important variables are discussed which are related to 
the maintenance of the operant response on which 
the classical conditioning procedure is imposed. 

Anything which affects the nature or strength of 
operant behavior may also affect the amount of disruption
produced by a specified conditioned stimulus
when it is superimposed on that behavior. This is
perhaps not surprising, since conditioned suppression 
can be regarded as the result of pitting classical against 
operant conditioning effects. An important study 
which emphasizes this was reported by Stein, Sidman, 
and Brady (1958), who investigated the effects of vary- 
ing the duration of a preshock stimulus through a 
range of 30 sec to 50 min and also examined the 
effects of varying the interval between successive stim- 
ulus presentations. Considerable variation was found 
in the amount of conditioned suppression produced 
by different combinations of these variables. However, 
neither of them proved to be a critical determinant in 
itself: instead, there was a high negative correlation 
between the amount of suppression and the relative 
duration of the preshock stimulus, i.e., the proportion
of time in any session during which the CS was pres- 
ent. In considering ways in which this somewhat ab- 
stract temporal value might control the amount of 
conditioned suppression, Stein et al. noted that the
behavior of the rats in their study was suppressed only 
to the extent that they did not thereby miss more 
than 10% of the total number of reinforcements which 
could be set up by the VI schedule. So, when a pre- 
shock stimulus (of any duration) was present for a 
relatively high proportion of the experimental ses- 
sion, complete suppression of operant behavior would 
have produced a substantial reduction in the number 
of reinforcements obtained. In these situations, only 
partial suppression of behavior was observed during 
the preshock stimulus. However, if a relatively short 
preshock stimulus was presented only rarely, the sub- 
jects could “afford” to suppress completely and yet 
still obtain at least 90% of the total possible rein- 
forcers, and indeed complete suppression was observed 
in such situations. 
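
One way to see how the relative duration of the CS could set a ceiling on suppression is a back-of-the-envelope calculation. The sketch below assumes, as a simplification not made in the text, that the proportion of reinforcers missed equals the proportion of session time spent not responding during the CS; under a VI schedule a reinforcer set up during the CS can often still be collected afterwards, so the real loss would be smaller. The 10% limit is the figure noted by Stein et al.; the function and parameter names are hypothetical.

```python
def max_suppression(cs_fraction, max_loss=0.10):
    """Largest proportional suppression compatible with the loss limit.

    cs_fraction: proportion of session time during which the CS is present.
    max_loss: largest tolerated proportion of reinforcers missed.
    Returns a value between 0 (no suppression) and 1 (complete suppression).
    """
    if not 0.0 < cs_fraction <= 1.0:
        raise ValueError("cs_fraction must lie in (0, 1]")
    # Simplifying assumption: loss = suppression * cs_fraction.
    return min(1.0, max_loss / cs_fraction)


if __name__ == "__main__":
    for fraction in (0.05, 0.10, 0.25, 0.50):
        print(fraction, round(max_suppression(fraction), 2))
```

Under these assumptions complete suppression is "affordable" only while the CS occupies no more than a tenth of the session, and the permissible suppression falls as the relative duration of the CS grows, which is the direction of the correlation reported.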

Carlton and Didamo (1960) reported a study based 
on that of Stein et al. (1958), but they varied the
length of their experimental sessions so that the num- 
ber of reinforcers actually obtained by subjects was 
constant throughout the various conditions of the 
experiment. Again it was found that the amount of 
conditioned suppression decreased as the relative 
duration of the preshock stimulus increased. Carlton 
and Didamo suggested that this reduction in the sup- 
pressive effect was due to “changes in response output 
which minimise the decline in reinforcement rate.” 
This suggestion implies that behavior which is rein- 
forced only occasionally will be less resistant to the 
suppressive effects of a preshock stimulus, for a “de- 
cline in reinforcement rate” resulting from suppres- 
sion during a fixed preshock stimulus might not be 
so readily detected. An experiment by Lyon (1964) 
supports this hypothesis. Using pigeons as subjects, 
Lyon superimposed a preshock stimulus on both components
of a multiple schedule in which two frequencies
of reinforcement were programmed (mult
VI 1-min VI 4-min). It was found that the pigeons’
behavior was more suppressed during the preshock 
stimulus when it was superimposed on the lower fre- 
quency of reinforcement than when it occurred dur- 
ing the component with the higher reinforcement 
frequency. Lyon therefore suggested that behavior 
which is reinforced relatively frequently is more re- 
sistant to disruption by a conditioned suppression 
procedure than is behavior which is reinforced only 
rarely. This has been corroborated by Blackman
(1968b), who used response-pacing procedures (Ferster 
& Skinner, 1957) which controlled responding at ap- 
proximately equal rates in two components of a multi- 
ple schedule, but in which the frequencies of rein- 
forcement were controlled by two different VI sched- 
ules. This made it possible to identify the effects of 
reinforcement frequency on conditioned suppression 
more unequivocally than did Lyon’s (1964) study, for 
in the latter the different frequencies of reinforcement 
produced different control response rates, a possible 
confounding effect. 

If the conditioned suppression phenomenon is con- 
ceptualized as the outcome of a competition between 
a classically conditioned response with a fixed strength 
and the tendency to emit operant responses which are 
occasionally reinforced, the relative resistance of be- 
haviors which result in frequent reinforcement may 
not seem surprising. Less predictable, however, is the 
finding that, when reinforcement frequency is con- 
trolled, rates of responding are differentially suscepti- 
ble to conditioned suppression during a preshock 
stimulus. Blackman has shown in a series of experiments
(1966, 1967, 1968b) that high rates of responding
are more suppressed during a preshock stimulus
than are lower rates which obtain the same frequency
of reinforcement. This conclusion was prompted by
suppression ratios, but since these are relative mea- 
sures of the responding during a preshock stimulus, it 
seemed possible that the differences in suppression 
ratio might have been merely artifacts of the different 
base line response rates. For example, if the absolute 
number of responses emitted during a preshock stim- 
ulus was constant whatever the base line response 
rates, then suppression ratios would inevitably sug- 
gest less suppression in the condition in which the 
preshock stimulus was superimposed on the lower 
rate. This was not the case, however. For example, in
many conditions the absolute response rates during a 
preshock stimulus were higher when it was superim- 
posed on the lower control rate than when it was 
superimposed on the higher control rate (Blackman, 
1968b, Table 6). So, for example, one subject (rat 1)
emitted 89 responses per min in the control conditions 
of one component (A) of a multiple schedule and 36 
responses per min in the control conditions of the 
other component (B). The schedules in both compo- 
nents provided 2 reinforcements per min on average, 
but different response-pacing requirements were in 
operation in each component. During a 1-min period
of tone which ended with a .5-mA shock the suppres- 
sion ratios (response rate during CS divided by re- 
sponse rate in absence of CS) for this rat were .04 on 
component A and .60 on component B. These ratios 
reflected a mean rate of 5 responses per min in the 
preshock stimulus when it was superimposed on com- 
ponent A (high control rate) and 22 responses per min 
during the same preshock stimulus when it was super-
imposed on component B (lower control rate). Hence
the relative differences in control response rates were 
reversed during the preshock stimulus, which supports 
the view based on suppression ratios that lower rates 
of responding are more resistant to disruption by a 
preshock stimulus than are higher rates. 
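
The arithmetic of this example can be checked with a short sketch. The function name `suppression_ratio` is ours, and the inputs are the rounded mean rates quoted above. Ratios recomputed from rounded means need not match the reported .04 and .60 exactly, since those were presumably derived from unrounded data.

```python
def suppression_ratio(cs_rate, control_rate):
    """Response rate during the CS divided by the rate in its absence."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return cs_rate / control_rate


# Rounded mean rates (responses per min) quoted in the text for rat 1.
components = {
    "A": {"control": 89, "cs": 5},   # high control rate
    "B": {"control": 36, "cs": 22},  # lower control rate
}

ratios = {
    name: suppression_ratio(rates["cs"], rates["control"])
    for name, rates in components.items()
}

# The ordering of the two components reverses during the preshock stimulus:
# A has the higher control rate but the lower absolute rate during the CS.
rates_reversed = (
    components["A"]["control"] > components["B"]["control"]
    and components["A"]["cs"] < components["B"]["cs"]
)

for name in components:
    print(name, round(ratios[name], 2))
print("ordering reversed during CS:", rates_reversed)
```

Both the relative measure (the ratio) and the absolute rates during the stimulus point the same way here, which is the reason the artifact explanation can be set aside for this subject.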

The amount of conditioned suppression depends in 
part, then, on the frequency of reinforcement and on 
the rate of operant responding. The effects of classical 
conditioning procedures may therefore be dependent 
on the schedule of reinforcement on which they are 
superimposed. The importance of schedules in deter- 
mining the behavioral effects of other independent 
variables requires no emphasis here, for it has been 
demonstrated in many other contexts, such as the 
effects of drugs (e.g., Kelleher & Morse, 1968) and the 
effects of unsignaled aversive stimuli (e.g., McKearney, 
1972). 


The effects of conditioned suppression procedures 
have now been investigated with all the principal
schedules of reinforcement. Lyon and Felton (1966a) 
studied pigeons’ behavior maintained by variable-ratio 
(VR) schedules. They had expected that as the mean 
ratio requirement was increased from 50 to 100 to 
200, the subjects would show more conditioned sup- 
pression, because the overall frequency of reinforce- 
ment would fall. In fact, however, the results of their 
experiment were inconclusive, for they found that the 
behavior maintained by all the VR schedules was 
quite insensitive to the conditioned suppression pro-
cedure. This may be because reinforcements were
contingent upon the continued and sustained emission 
of responses with this schedule in a way that is not the 
case with variable-interval schedules. Fantino (1973) 
has pointed out that partial suppression during a pre- 
shock stimulus superimposed on a variable-interval 
schedule can have, within limits, virtually no effect on 
the rate of reinforcement. This is clearly not the case 
with ratio schedules. Fantino therefore regards the 
results of Felton and Lyon as being “readily inter- 
pretable.” However, Blackman (1966) reported an 
experiment using rats as subjects in which VR 100 
behavior was far from resistant to conditioned sup- 
pression: all three rats showed virtually complete con- 
ditioned suppression when the shocks (.5 mA, .2 sec) 
were introduced. Another three animals were “yoked” 
to these first three, i.e., reinforcements were made
available to them by the delivery of reinforcements 
to the VR animals, so that they were in effect on a VI 
schedule with a mean interreinforcement interval
identical to that of the ratio animals. These VI an- 
imals did not show such severe conditioned suppres- 
sion, which emphasizes the susceptibility of the ratio 
animals to conditioned suppression in this experi-
ment, in contrast to the resistance shown by Lyon
and Felton’s pigeons. The reasons for these incon- 
sistencies remain obscure; one hesitates to invoke 
species differences, especially as pigeons and rats ap- 
pear to be used interchangeably in other studies of 
conditioned suppression. Perhaps Fantino’s (1973) 
suggestion that “it would have been interesting [in the 
Lyon and Felton study] to note whether conditioned 
suppression would have been obtained with high 
shock intensities” is useful, for it is possible that the 
shocks used by Lyon and Felton would not have sup- 
pressed other patterns of behavior in these pigeons. 
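
Fantino's point about the two kinds of schedule can be illustrated with a rough calculation. The sketch below is ours and rests on idealizing assumptions not made in the studies cited: the interval schedule is treated as a random-interval schedule with exponentially distributed setup times, responding is treated as a Poisson process, and the response rates are hypothetical.

```python
def vr_reinforcement_rate(response_rate, mean_ratio):
    """Reinforcements per min on a variable-ratio schedule."""
    return response_rate / mean_ratio


def vi_reinforcement_rate(response_rate, mean_interval):
    """Reinforcements per min on an idealized random-interval schedule.

    Assumes reinforcers are set up at exponentially distributed intervals
    (mean `mean_interval` min) and that responses occur as a Poisson
    process, so the mean time between reinforcements is the mean setup
    time plus the mean wait for the next response.
    """
    return 1.0 / (mean_interval + 1.0 / response_rate)


baseline_rate = 60.0    # hypothetical responses per min outside the stimulus
suppressed_rate = 30.0  # hypothetical rate if responding were halved

vr_loss = 1 - (vr_reinforcement_rate(suppressed_rate, 100)
               / vr_reinforcement_rate(baseline_rate, 100))
vi_loss = 1 - (vi_reinforcement_rate(suppressed_rate, 1.0)
               / vi_reinforcement_rate(baseline_rate, 1.0))

print(f"VR 100: reinforcement rate falls by {vr_loss:.0%}")
print(f"VI 1-min: reinforcement rate falls by {vi_loss:.1%}")
```

Under these assumptions, halving the response rate halves the reinforcement rate on VR 100 but lowers it by less than 2% on VI 1-min, which is the asymmetry Fantino describes.
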
The effects of conditioned suppression procedures 
have also been investigated with fixed-ratio and fixed- 
interval schedules. Lyon (1964) found that the effects 
of a preshock stimulus superimposed on FR 150 be- 
havior in pigeons depended on how far the bird was 
advanced in its sequence of behavior when the
stimulus was presented. If this occurred just after
reinforcement, Lyon observed complete suppression 
during the stimulus. If, when the stimulus was intro- 
duced, the bird had emitted more than 60 responses 
in the required sequence of 150, it continued respond- 
ing until the next reinforcement and then suppressed 
completely until the end of the stimulus. If the stim- 
ulus began when the bird was between 20 and 60 
responses into the required sequence, immediate sup- 
pression was sometimes seen and on other occasions 
the animal continued to respond until the next rein- 
forcement was obtained. The initial resistance to sup- 
pression when the bird had completed more than 60 
responses may perhaps be taken as support for Lyon 
and Felton’s (1966a) report that variable-ratio be-
havior in pigeons is resistant to suppression, for with
the variable schedule the imminence of the next rein- 
forcement may always be as close as in those condi- 
tions of Lyon’s fixed-ratio experiment when the 
behavior was found to be resistant to suppression. Simi- 
larly, with variable-ratio schedules, the birds do not 
show postreinforcement pauses as they do on FR 150. 
Another study by Lyon and Felton (1966b) found that 
birds exposed to FR 25 (and to a lesser extent FR 50) 
did begin to respond again after a reinforcement had 
been obtained during a preshock stimulus. The birds 
therefore often obtained several reinforcements dur- 
ing the stimulus, and so to that extent could also be 
described as resistant to conditioned suppression. 
The crucial relationship between the onset of a 
preshock stimulus and the imminence of reinforcement 
has also been suggested with fixed-interval schedules. 
For example, Blackman (1968a) discussed the behavior 
of one subject (rat 1) which was exposed to a FI 20-
sec schedule. A 1-min preshock stimulus was presented
in such a way that the next reinforcement became
available 5 sec after its onset (and, therefore, 25 and
45 sec after the onset). When the shock which ended 
this stimulus was of .5 mA for .2 sec, the rat responded 
long enough into the stimulus to obtain the first two 
of these reinforcements (i.e., for 25 sec), and then sup- 
pressed completely until the end of the stimulus. 
When the shock was increased to 1.6 mA, responding 
continued only long enough for the first reinforce- 
ment to be obtained (i.e., for 5 sec). With a shock 
setting of 3.0 mA for .5 sec, all responding was sup- 
pressed immediately the preshock stimulus was pre- 
sented, even though a reinforcement would become 
available only 5 sec later. A study by Lyon and Millar 
(1969) also suggests that the imminence of reinforce- 
ment in an FI schedule may attenuate conditioned 
suppression. In the interreinforcement intervals of an 
FI 2-min schedule, a preshock stimulus of 30 sec was 
presented 30, 60, or 90 sec after the preceding rein-
forcement. It was found that there was marked sup-
pression of responding during the stimulus when it 
was presented early in the interval, but no suppression 
when the stimulus occurred late in the interval. 
Preshock stimuli have also been superimposed on 
behavior maintained by a schedule which differen- 
tially reinforces a very low rate of responding (DRL). 
In some circumstances it has been shown that re- 
sponding on this schedule increases in frequency 
during a preshock stimulus. For example, Blackman 
(1968a) exposed rats to a multiple schedule, of which 
one component was a DRL 15-sec schedule and the 
other generated higher response rates. In all condi- 
tions of the experiment, this second pattern of be- 
havior was suppressed during a preshock stimulus. 
However, when the stimulus ended with a relatively 
mild shock (.2 mA for .5 or 1.0 sec), the DRL behavior 
increased in frequency during its presentation, al- 
though with higher intensities of shock the more 
usual suppressive effect was found. The acceleration 
of DRL responding during a stimulus which ends 
with a mild shock is illustrated in Figure 4, which 
shows the cumulative records of three rats exposed 
throughout each experimental session to a DRL 15- 
sec schedule. A tentative suggestion has been made 
(Blackman, 1968a) that the acceleration effect on the 
DRL responding was attributable to a disruption of 
the collateral behavior which appeared to mediate the 
lever-pressing behavior. These stereotyped sequences
of behavior were not formally measured in the ex- 
periment, but they characterized the DRL behavior in 
control conditions. During the preshock stimulus, 
however, the collateral behaviors seemed to be quickly 
disrupted, and lever pressing then occurred in their
absence and at a higher frequency than in control
conditions.

Fig. 4. The accelerative effects of a preshock stimulus on re-
sponding maintained by a schedule which differentially rein-
forces a low rate of lever pressing (DRL 15 sec). The response
pen is offset during the preshock stimulus; hatchmarks denote
reinforcement. (Unpublished data of Sanger & Blackman.)

In an experiment which employed a two-lever situ- 
ation (Blackman, 1970a), rats were exposed to a sched- 
ule in which a response on lever B was followed by 
reinforcement if a preceding response on lever A had 
been made at least 5, 10, or 15 sec before. When a 
preshock stimulus was superimposed on the behavior 
generated by this schedule, it was found that the fre- 
quency of timing attempts, i.e., of A-to-B sequences, 
decreased during the stimulus. This had also been 
found in a similar experiment by Migler and Brady 
(1964). When the delay required was 5 sec, there was 
no change in the distribution of A-to-B times during 
the preshock stimulus. Thus although the frequency 
of timing attempts decreased, their accuracy was not 
impaired, again replicating a finding by Migler and 
Brady (1964). However, when the required A-to-B 
delay was 10 or 15 sec, the distribution of A-to-B times 
changed during the preshock stimulus, there being 
more shorter intervals. Also noticeable (especially with 
the 15-sec-delay requirement) was an increased pro- 
portion of inappropriate B responses, i.e., B responses 
which were made without a preceding A response to 
initiate a timing attempt. The disruption of timing 
efficiency and the increase in inappropriate B responses
led in one case to an overall acceleration of B re- 
sponses in comparison with control conditions, in 
spite of the decreased frequency of appropriate timing 
attempts. This accelerative effect may be analogous to
the acceleration reported with a single-lever DRL 
schedule. A generally similar effect was reported by 
Blackman and Scruton (1973a), who superimposed a 
preshock stimulus on a two-lever counting schedule. 
In this case, rats were required to make at least a 
specified number of successive responses on lever A 
before switching to lever B to produce reinforcement,
and there was a shift to shorter sequences of responses 
on lever A during the preshock stimulus. 

Hearst (1965) has reported a deterioration in dis- 
criminative control as a result of the Estes-Skinner 
procedure. He superimposed a preshock stimulus on 
periods of intermittent reinforcement, and found that 
the operant behavior was suppressed in the usual way 
during this stimulus. However, the rats were also 
exposed to periods when no reinforcement was possi-
ble (extinction). The preshock stimulus never occurred
during periods of extinction, but the deterioration in 
discriminative control reported by Hearst took the
form of an increase in responding during these peri- 
ods. Hearst related this finding to the Pavlovian con- 
cept of disinhibition. However, it should perhaps be 
mentioned that two subsequent experiments have 
failed to replicate his findings (Blackman & Scruton,
1973b; Weiss, 1968). In the former case, there was no 
increase in responding during periods of extinction, 
even when the preshock stimulus was subsequently 
presented during extinction periods as well as during 
the periods of intermittent reinforcement. The rea- 
sons for these potentially important inconsistencies 
between experiments are not clear as yet. 

It is clear that the amount of suppression produced 
by a given preshock stimulus depends crucially on the 
nature of the schedule which maintains the operant 
behavior on which it is superimposed. Indeed, in some 
circumstances (albeit limited to fairly stringent timing 
schedules) a preshock stimulus will lead to an in- 
crease in the rate of food-reinforced operant respond- 
ing. Certain familiar schedules of reinforcement, par- 
ticularly variable-interval, provide a base line of 
behavior against which the effects of classical condi- 
tioning parameters can be readily assessed. However, 
it is also the case that, when classical conditioning
procedures are held constant, substantial differences
in the effects of a conditioned stimulus may emerge
as a function of the precise schedules of reinforcement 
which maintain the operant behavior, differences not 
only of degree but even on occasion of direction. 


MEASUREMENT OF CONDITIONED 
SUPPRESSION 


We have already noted that most workers have 
measured conditioned suppression by comparing the 
response rate during the preshock stimulus with the 
control rate of responding, i.e., in the absence of
the preshock stimulus. We may continue to regard the 
different formulae which have been used to make such 
a comparison as no more than tedious and potentially 
confusing. However, Hoffman (1969) and Millenson
and de Villiers (1972) have suggested that the use of 
relative measures has not been adequately justified. A 
relative measure may make it easy to compare condi- 
tioned suppression of different patterns of operant be- 
havior, but this entails an arbitrary assumption: 
“Measurement by relative suppression presupposes 
that under constant experimental conditions the warn- 
ing signal will produce the same relative decrement 
independent of the rate of the responding at the mo- 
ment of the warning signal presentation” (Hoffman, 
1969, p. 68). There is now ample evidence that this 
assumption is false, as shown in the previous section. 
Suppression ratios obtained in experiments in which 
the same preshock stimulus is superimposed on differ-
ent rates of operant behavior are not identical (Black-
man, 1966, 1967, 1968b). Such a finding offers the
investigator two very different interpretations (Black- 
man, 1972): 

1. A suppression ratio may always reflect ac- 
curately the strength of a classically conditioned re- 
sponse elicited by the conditioned stimulus. In other 
words, the more severe the disruption of operant be- 
havior (as expressed by a suppression ratio), the greater 
is the strength of the CR. If this is true, different 
strengths of CR are developed by a uniform pro- 
cedure when it is superimposed on different operant 
response rates. Why this should be so remains un- 
explained, although it has often been suggested in a 
more general context that the effects of any indepen- 
dent variable depend on the nature of ongoing be- 
havior as controlled by schedules of reinforcement 
(e.g., Dews, 1963).

2. A standard classical conditioning procedure may
always result in a CR of uniform strength. Different 
suppression ratios describing the effects of a preshock 
stimulus on different patterns of operant behavior 
may then result from the fact that this uniform con- 
ditioned response interacts differently with these 
patterns of behavior. According to such a view, the sup-
pression ratio is therefore not an uncontaminated
reflection of the strength of a CR. 

It is difficult to decide between these alternatives,
and initial preferences may reflect whether one’s basic 
allegiance is to the study of classical conditioning or 
of operant conditioning processes. Given this problem 
of interpreting suppression ratios, however, it would 
seem only prudent to support inferences based on 
these ratios by absolute data whenever possible. Thus 
in the case discussed in the previous section, suppres- 
sion ratios suggested that high response rates were 
more disrupted by a preshock stimulus than were 
lower rates of responding. Corroborative evidence was
provided by the absolute data, which showed that the 
stimulus was accompanied by fewer responses (in abso- 
lute terms) when superimposed on normally high re- 
sponse rates than when superimposed on normally 
lower rates. These two forms of data suggested that a 
differential effect is attributable to the patterns of 
operant behavior. 

In many cases of interactions between a classical 
conditioning procedure and different patterns of op- 
erant behavior, the dilemma outlined above can be 
regarded as unimportant, for research interest may 
focus principally on the details of schedule control 
and its disruption. The matter can be of great prac- 
tical significance, however, as may be illustrated by an 
example. Figure 5 shows data obtained from one rat 
which was exposed to a FR 10 schedule of food reinforcement.

Fig. 5. The effects of various dosages of chlordiazepoxide on
conditioned suppression of a rat’s responding on a FR 10 sched-
ule. Complete suppression = 0; no effect = 1.0. (Unpublished
data of D. Sanger.)

When this behavior had stabilized, a con-
ventional conditioned suppression procedure was in- 
troduced, the details of which are not important here. 
The rat was then tested after various injections of the 
minor tranquilizer chlordiazepoxide. Figure 5 plots 
the suppression ratios which were obtained with the 
various doses, these being calculated by Stein, Sidman,
and Brady’s (1958) formula: SR = CS rate/Control 
rate. It seems clear that increasing doses of the drug 
have an increasingly attenuating effect on conditioned 
suppression. However, with a relative measure such as 
this, these changes may be produced either by changes 
in response rate during the CS or by changes in its 
absence. Figure 6 shows the absolute response rates in 
these two periods after each dosage of the drug. The
open points display the drug’s effects on response 
rates during the preshock stimulus (FR 10/CS), and 
the closed points show response rates in its absence. 
Clearly, the effects of the drug are by no means as 
simple as at first they may have seemed on the basis 
of the suppression ratios in Figure 5. In fact, the ra- 
tios change as a result of changes in response rates 
both within the CS and in control conditions. With 
lower dosages, increases occur in both response rates. 
The differences between the effects of 8 mg/kg and 
16 mg/kg, however, can be seen to be almost entirely 
due to differences in control rates, with little change 
in CS rates. The dilemma outlined earlier therefore
appears: does the orderly effect of increasing dosage
on suppression ratios reflect (1) progressive reduc- 
tions in the strength of the underlying conditioned 
response, or (2) merely contaminations produced by a 
changing control base line, with the strength of the 
conditioned response remaining unchanged? Although 
it may be difficult to answer such a question, it is
surely clear that informed interpretation of the drug’s 
effects demands absolute as well as relative data. Nev- 
ertheless, experimental reports continue to emphasize 
suppression ratios and frequently fail to supplement 
these by absolute rates in the presence and absence of 
a conditioned stimulus. 
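
The ambiguity can be made concrete with a small sketch that applies the formula SR = CS rate/Control rate. The rates below are hypothetical values chosen for illustration and are not the data plotted in the figures. They show a suppression ratio doubling while the absolute rate during the CS does not change at all.

```python
def suppression_ratio(cs_rate, control_rate):
    """SR = CS rate / Control rate (0 = complete suppression, 1.0 = no effect)."""
    return cs_rate / control_rate


# Hypothetical response rates (responses per min) after two doses.
lower_dose = {"cs": 10.0, "control": 100.0}
higher_dose = {"cs": 10.0, "control": 50.0}  # CS rate unchanged, control rate halved

ratio_lower = suppression_ratio(lower_dose["cs"], lower_dose["control"])
ratio_higher = suppression_ratio(higher_dose["cs"], higher_dose["control"])

print(f"lower dose:  SR = {ratio_lower:.2f}")
print(f"higher dose: SR = {ratio_higher:.2f}")
```

Judged by the ratios alone, the higher dose appears to attenuate suppression. The absolute rates show that responding during the CS is identical at the two doses and that only the control base line has moved.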

A further complication arises whenever any simple 
measure of conditioned suppression is reported, 
whether this be in the form of relative or of absolute 
response rates during a preshock stimulus. We have 
noted previously that a stimulus which ends with a 
shock is a conditioned stimulus within the Pavlovian 
delayed conditioning paradigm. With traditional Pav- 
lovian procedures, some form of temporal discrimina- 
tion usually develops within such a CS, the conditioned
response eventually being characteristically elicited
only toward the end of the CS (i.e., just before the
US).

Fig. 6. The effects of chlordiazepoxide on conditioned suppres-
sion. The data are those which are expressed in the form of
suppression ratios in Figure 5, but are here plotted in absolute
terms, i.e., as response rates during the preshock stimulus (FR
10/CS) and rates in its absence (FR 10). (Unpublished data of
D. Sanger.)

A similar temporal discrimination has sometimes
been reported with conditioned suppression. For ex- 
ample, Hendry and Van Toller (1965) reported that 
initial sustained suppression throughout a preshock 
stimulus was superseded by a pattern in which the 
suppression occurred only in the second half of the 
stimulus. In some cases, response rates during the first 
half of the stimulus in fact increased in comparison 
with control conditions. However, such temporal pat- 
terning has not been reported consistently in the 
literature, even in studies in which a preshock stim- 
ulus has been superimposed on operant behavior re- 
peatedly. For example, Stein et al. (1958) remark that 
the type of response patterning within a fixed pre- 
shock stimulus was “not necessarily invariant from 
one stimulus presentation to the next.” This is a
further example of an inconsistency in the literature 
which has not yet been resolved. The development of 
temporal patterning may depend on a number of 
variables, such as the relative duration of the preshock
stimulus, the intensity of the shock, the number of
conditioning trials, and the nature of the schedule
which maintains the operant behavior. 

Unrecorded temporal patterning within a preshock 
stimulus could have a considerable contaminating
effect on reported results. Millenson and Leslie (1974), 
for example, argue that a drug which appears to 
alleviate or enhance conditioned suppression might 
do so principally by affecting the nature of any such 
temporal discrimination. There would appear to be
two ways of counteracting this possible contamina- 
tion. The first is to vary the duration of the preshock 
stimulus from trial to trial, although still ending each 
stimulus presentation with a shock. Millenson and 
Hendry (1967) found that such a procedure did result 
in consistently suppressed responding during the stim-
ulus. An alternative expedient is to use a conditioned
stimulus of fixed duration in the usual way, but to 
deliver shocks at unpredictable moments throughout 
the stimulus and not merely as the stimulus ends. 
This procedure has been used occasionally. For exam- 
ple, Azrin (1956) included it (termed by him “VI 
uncorrelated shock”), and his cumulative records re- 
veal consistent suppression throughout the stimulus 
associated with shock. More recently, Bond, Black- 
man, and Scruton (1973, Experiment 2) used the pro- 
cedure. In this experiment the response rates were not 
entirely consistent throughout the stimulus associ- 
ated with shock, but the inconsistencies may have re- 
sulted from the suppressive effects of the procedure on 
adjunctive licking which had reliably developed in
this experiment: certainly, there was no evidence of
an orderly temporal patterning during the stimulus. 

Of course, the delivery of shocks at unpredictable 
moments during a stimulus is strongly reminiscent of 
Rescorla’s procedures reviewed earlier, although in 
these studies shocks and another stimulus were as- 
sociated only before operant conditioning occurred, 
and only the CS was subsequently superimposed on 
operant behavior. Nevertheless, a complication even 
in presenting shocks at random moments during a 
stimulus emerges from one of Rescorla’s studies 
(1968). He found that response rates were not con- 
sistently suppressed even when the conditioned stim- 
ulus was superimposed on variable interval behavior. 
There was greater suppression during the initial parts 
of the stimulus, with less in the later parts (i.e., the 
opposite of Hendry & Van Toller’s 1965 results using 
a conventional preshock stimulus). Rescorla suggested 
that this effect may reflect the fact that the onset of a 
CS is more discriminable than its continued presence.
A second possibility mentioned by Rescorla, however, 
brings us full circle, for he suggests that his differen- 
tial suppression within a CS may be 


an artifact of the measuring technique. A VI 
schedule of reinforcement is such that the longer 
[a subject] has refrained from pressing, the 
higher the probability that its next press will 
be reinforced. Thus the longer [the subject] 
suppresses, the more “pressure” the base-line 
operant schedule places on it to respond. (Res-
corla, 1968, p. 5)


Rescorla therefore goes on to suggest that the strength 
of the classically conditioned response may be con-
stant throughout the CS: only the tendency to emit an
operant response changes. This suggestion is clearly 
based on the second interpretation of conditioned sup- 
pression discussed toward the beginning of this section. 
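
Rescorla's remark about the “pressure” exerted by a VI schedule can be given a rough numerical form. The sketch below is ours and assumes an idealized schedule in which reinforcers are set up at exponentially distributed intervals, and in which no reinforcer is already waiting when the pause begins. The 60-sec mean interval and the pause lengths are hypothetical.

```python
import math


def p_next_press_reinforced(pause, mean_interval):
    """Probability that a reinforcer has been set up during a pause.

    Assumes exponentially distributed setup intervals with the given mean
    (same time units as `pause`) and no reinforcer pending at pause onset.
    """
    return 1.0 - math.exp(-pause / mean_interval)


mean_interval = 60.0            # hypothetical VI 60-sec schedule
pauses = [5, 15, 30, 60, 120]   # sec without a response

probabilities = [p_next_press_reinforced(t, mean_interval) for t in pauses]

for t, p in zip(pauses, probabilities):
    print(f"pause {t:>3} sec -> P(next press reinforced) = {p:.2f}")
```

On these assumptions the probability rises steadily with the length of the pause, so a subject that suppresses early in the CS faces a growing incentive to respond later in it, even if the strength of the conditioned response is constant.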

The measurement of conditioned suppression is 
fraught with difficulties, some of which pose interest- 
ing dilemmas. There is scope for ambiguity even 
when responding is totally suppressed during a pre- 
shock stimulus. For example, Lyon (1965) claimed 
that a change in base line response rate is not suffi- 
cient to change the amount of conditioned suppres- 
sion. However, in the first phase of his experiment, 
Lyon used a procedure which resulted in complete 
conditioned suppression, and he then found that com- 
plete suppression still occurred when the base line 
response rate was increased. Subsequent research (e.g.,
Blackman, 1968b) has shown that increases in base 
line response rate lead to an increase in the amount 
of conditioned suppression. Since this effect could not 
be shown in Lyon’s experiment because of a “ceiling
effect,” he was therefore led to a general conclusion 
which was false. Problems of measurement in studies 
of conditioned suppression must therefore be borne 
in mind constantly in this area of research. Sometimes 
a simple suppression ratio in one sustained experi- 
mental situation may not be the most useful measure. 
For example, Fleshler and Hoffman (1961) investi- 
gated the stimulus generalization of conditioned sup- 
pression with pigeons. First, complete conditioned 
suppression was obtained during a 1000-Hz tone which 
preceded a shock. Then tones of different frequencies 
were presented in a generalization test in extinction 
conditions (i.e., no tone ended with shock). At first, the 
stimulus generalization gradient, which was measured 
by suppression ratios, was flat, there being almost 
complete conditioned suppression during all the tones. 
However, as testing proceeded, the gradient sharp- 
ened, the suppression ratios during the tones which 
were most different from the previous CS showing 
that these stimuli were the first to lose their control 
over behavior. Thus the flat gradient first obtained 
did not reflect uniform effects of the different test 
stimuli on behavior, and Fleshler and Hoffman's ex- 
tinction procedure made it possible to identify their 
different degrees of behavioral control in spite of an 
initial “ceiling effect.” 

Although no simple measure of conditioned sup- 
pression is entirely satisfactory, many problems of in- 
terpretation may be overcome by using measures of 
absolute response rates during a preshock stimulus as 
well as the relative measures provided by suppression 
ratios, and in some circumstances by using repeated 
tests in changing conditions (as in Fleshler & Hoff- 
man’s 1961 experiment). The most important conclu- 
sion to be prompted, however, is that the most appro- 
priate measure of conditioned suppression in any 
experiment should always be considered carefully. 


SOME INTERPRETATIONS OF 
CONDITIONED SUPPRESSION 


Why is positively reinforced operant behavior usu- 
ally suppressed during a stimulus which is associated 
with shock? Three major explanations for this phe-
nomenon will be considered here: operant behavior is
suppressed because (1) other behaviors resulting from 
the procedure interfere with it; (2) the procedure gen- 
erates an emotional state which affects the underlying 
motivational state of the subject; (3) the procedure 
allows for occasional adventitious punishment of the 
operant behavior. It is not always easy to keep these 
three accounts separate, and any attempt to do so
results in some arbitrary decisions. The discussion 
continues to be confined to the effects of a preshock 
stimulus on food-reinforced behavior. The extension 
of the theories to other examples of classical condi- 
tioning effects on operant base lines will be consid- 
ered subsequently. 


The Interference Hypothesis 


The possibility that other behaviors interfere with 
ongoing operant behavior to produce conditioned 
suppression has been suggested in terms both of com- 
peting respondent behavior and of competing op- 
erants, although the former has received by far the 
more attention. 

We have seen in a preceding section of this chapter 
that conventional Pavlovian conditioning parameters 
such as the intensity of the CS and the US determine 
the severity of conditioned suppression, so that the 
phenomenon has been frequently studied as an ex- 
ample of Pavlovian conditioning. Kamin (1965) has 
expressed a widely held view that “the most obvious 
assumption has been that the interference with be- 
havior, which serves as our measure, is largely the 
result of incompatibility between respondents elicited 
by S, [the pre-shock stimulus] and the ongoing be- 
havior.” 

The empirical status of this interfering respondents
hypothesis is open to some doubt. First, it is necessary 
to specify the behaviors said to be conditioned during 
the preshock stimulus which are supposed to inter- 
fere with the operant behavior. Second, it remains 
necessary to show why and how any such behaviors 
are incompatible with the emission of an operant 
such as pressing a lever. There are several obvious 
contenders in answer to the first of these questions, 
but surprisingly little systematic work has been car- 
ried out in an attempt to monitor changes in auto- 
nomic activities to see if their intensities vary with the 
amount of suppression of operant behavior. On a 
gross level, traditional signs of autonomic activity 
such as defecation, urination, piloerection, and freez- 
ing of motor activity have frequently been discussed 
in the context of conditioned suppression. In an early 
experimental program by Brady and his associates 
(see, for example, Brady, 1951) the effects of a pre- 
shock stimulus were measured either in terms of the 
suppression of operant behavior with one group of 
animals or in terms of gross changes in these auto- 
nomic activities. Similarly, Hunt and Brady (1955) 
commented on such activity during a stimulus which 
precedes an unavoidable shock. There seems little 
doubt that signs of autonomic activity such as these do
characteristically accompany the early stages of many 
conditioned suppression experiments. However, Mil- 
lenson and de Villiers (1972) have suggested that these 
signs seem to decrease with continued testing, al- 
though their comment is based on informal observa- 
tions which deserve to be quantified systematically. 
Certainly in later stages of experiments suppression 
of operant behavior does appear to persist when gross 
signs of autonomic arousal are minimal or nonexis- 
tent. 

There have been many studies of other more con- 
strained respondent changes resulting from the de- 
livery of a preshock stimulus (see, for example, the 
review by Weiskrantz, 1968). Two experimental pro-
grams which have related such changes to simultane-
ous suppression of operant behavior are particularly
interesting. In the first of these (de Toledo & Black,
1966) the heart rates of rats were recorded. It was
found that changes in heart rate did occur during the
preshock stimulus, but they developed less quickly
than did the suppression of operant responding.
Moreover, the changes in heart rate were much more
variable and of shorter duration than the operant
suppression. This finding has been supported in a
study by Brady, Kelly, and Plumlee (1969), in which 
the heart rate and blood pressure (both systolic and 
diastolic) of rhesus monkeys were monitored through- 
out the development and maintenance of conditioned 
suppression. During the preshock stimulus, there were 
certainly changes in these autonomic indicators. 
Again, however, suppression of the operant behavior 
developed before any detectable and reliable changes 
in heart rate and before changes in blood pressure. It
was impossible to identify any consistently similar 
variations in the dependent variables in this study: 
with one subject changes in heart rate even appeared 
to be inversely related to the amount of conditioned 
suppression of operant behavior. On frequent occa- 
sions the two measures of blood pressure showed di- 
vergent patterns of conditioned reactions. The results 
of this experiment are illustrated by data from one 
subject in Figure 7. This shows the percentage changes
in each of the four behavioral measures, expressed for 
each of the successive minutes of the 3-min preshock 
stimulus. Selected conditioning trials are shown. Reli- 
able suppression of operant behavior developed be- 
fore any consistent disruption in autonomic activities. 
The lack of consistent covariation between the mea- 
sures can also be seen. This monkey also shows the 
development of a temporal discrimination in the con- 
ditioned suppression of operant behavior, as discussed 
earlier. This began to develop by the 16th trial, and
eventually took the form of only slight suppression in
the first minute of the stimulus, with almost total
suppression in the second and third minutes. How-
ever, measures of autonomic activity fail to show this
biphasic pattern. It is also worth noticing that on 
some trials (e.g., trials 18 and 31 of those shown) lever 
pressing occurred more frequently in the first minute 
of the stimulus than in control conditions—the effect 
reported by Hendry and Van Toller (1965) and dis- 
cussed earlier. Again, there is no characteristic pat- 
terning of autonomic activity which reflects this dis- 
tribution of operant responses during the preshock 
stimulus. 
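
The percentage-change measure plotted in Figure 7 can be illustrated with a minimal sketch. It assumes that each value during the preshock stimulus is expressed relative to a control value taken in the absence of the stimulus, so that zero means no change. The function name and all numbers are hypothetical and are not data from Brady, Kelly, and Plumlee (1969).

```python
def percent_change(stimulus_value, control_value):
    """Percentage change of a measure during the stimulus, relative to control."""
    if control_value == 0:
        raise ValueError("control value must be nonzero")
    return 100.0 * (stimulus_value - control_value) / control_value


# Hypothetical lever-pressing rates (responses per minute), for illustration only.
control_lever_rate = 40.0
lever_rate_by_minute = [36.0, 4.0, 2.0]  # minutes 1-3 of the preshock stimulus

changes = [percent_change(rate, control_lever_rate) for rate in lever_rate_by_minute]
print(changes)
```

With these invented values the first minute shows only a slight decrease and the later minutes an almost total one, which is the kind of temporal pattern described above for the operant measure. The same function could be applied to heart rate or blood pressure values.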

On the basis of their data, Brady and his associates 
concluded that the operant and autonomic effects of 
their experiment were causally independent, although 
doubtless related in complex ways. In the most general
terms, they suggested that their finding “reflects
unfavorably upon theoretical formulations that em-
phasize either the causal interdependence of behavioral
and physiological events or the primacy of either
one.”

Fig. 7. Changes in blood pressure, heart rate, and lever-pressing
response rate of a rhesus monkey during the 3 min of a pre-
shock stimulus. The zero points represent control values in the
absence of the preshock stimulus. (From Brady, Kelly, & Plum-
lee, 1969.)

Of course, in the context of the interference hy-
pothesis of conditioned suppression it would always
be possible to suggest that experimenters have failed 
to measure those respondents which do have a disrup- 
tive effect on operant behavior. Nevertheless, the evi- 
dence at present points unequivocally to the conclu- 
sion that conditioned respondents may accompany 
but do not cause conditioned suppression. In any case, 
even if some respondent were to be identified which 
varied in direct proportion to variations in operant 
responding, it would be by no means clear why it 
should be physically incompatible with that respond- 
ing, the second necessary step if the interference the- 
ory is to be convincing. Further difficulties for this 
hypothesis are presented by the differential disruption 
of various frequencies and patterns of operant re- 
sponding which was reviewed in an earlier section, for 
it is not obvious why any interfering respondent be- 
havior should be more incompatible with some pat- 
terns and frequencies of operant responding than 
with others. This is, of course, particularly true when 
operant response rates are similar but reinforcement 
frequencies differ. Nevertheless it would certainly be 
of great interest to monitor autonomic changes when 
a preshock stimulus results in differential suppression 
of operant behavior. For example, although operant
and respondent behavior may not be functionally re-
lated on a 1:1 basis during a preshock stimulus, it
would be interesting to discover whether the two
classes of behavior are relatively resistant to disrup-
tion in the same circumstances (e.g., in situations
which generate low operant response rates or with 
high reinforcement frequencies). Experiments such as 
this might have the greatest relevance in the general 
study of the relationships between autonomic proc- 
esses and operant behaviors and of the relationship 
between physiological events and directly observable 
behaviors. 

There remains one pattern of behavior not yet 
fully discussed but whose occurrence during a pre- 
shock stimulus might certainly be physically incom- 
patible with lever pressing. Rats frequently crouch or 
“freeze” when shocked, and such behavior might oc- 
cur during a preshock stimulus. Discussion of this 
possibility has been delayed, since it would be diffi- 
cult to assert that this would necessarily be an exam- 
ple of a competing respondent. Leaving aside the 
question of whether such skeletal behaviors can be 
classically conditioned (see Chapter 3), it is possible 
that they might develop or be maintained as a result 
of their consequences, and hence should be regarded 
as competing operant behavior. In other words, adopt-
ing certain postures such as “freezing” might minimize
the aversiveness of a shock when it is delivered, as
Weiskrantz (1968) has suggested. 

“Freezing” behavior during a preshock stimulus has 
been investigated with pigeons by Stein, Hoffman, and 
Stitt (1971). They used ethological recording tech- 
niques to measure general behavior which occurred 
in addition to operant key pecking and found that 
there was a marked decrease in all overt activity (in- 
cluding key pecking) during the stimulus. In this 
experiment it is unlikely that such a general in- 
hibitory effect in behavior was maintained by an 
unprogrammed instrumental contingency, since the 
shock was delivered through wing bands. 

Whether any “freezing” responses during a pre- 
shock stimulus should be regarded as competing re- 
spondents or competing operants, this general inter- 
pretation of conditioned suppression is open to the 
objections discussed above. As with other putative 
competing responses, even if they occur reliably and 
consistently, it is not clear whether they interfere 
with recorded operants and thereby cause their sup-
pression or are merely a reflection of the same process
which causes such suppression.


Motivational Explanations 


Many researchers have suggested that a preshock 
stimulus produces a change in the motivational state 
of a subject, which in turn leads to conditioned sup-
pression. In recent years, Estes (1969, p. 80) has sug-
gested that


a stimulus which has preceded a traumatic 
event, e.g., shock, . . . acquires the capacity of 
inhibiting the input of amplifier elements from 
sources associated with hunger, thirst and the 
like. If then, while the animal is performing an 
instrumental response for, say, food reward, this 
conditioned stimulus is presented, the facilita- 
tive drive input will be reduced and so also the 
probability or rate of the instrumental response. 


A preshock stimulus may therefore be said to produce 
anxiety, which can be regarded as a motivational force 
which reduces positive motivation for reinforcement 
and thereby decreases the frequency of operant be- 
havior. 

A similar argument may also be developed from 
the description of conditioned suppression as result-
ing from a “conditioned emotional response” (CER).
Thus Hunt and Brady (1951) hypothesized “an inter- 
nal state underlying the behavioral reaction,” and 
Kamin also used the term “CER” frequently (e.g., 
1965). However, it is not consistently clear whether 
either Hunt and Brady or Kamin wish to emphasize
the adjective emotional sufficiently to demand that 
their theories be considered under the present head- 
ing rather than the previous one; for Kamin, at least, 
has consistently conducted research which could be 
said rather to emphasize the “conditioned . . . re-
sponse” as the behavioral outcome of Pavlovian con-
ditioning rather than as a motivational state. In their
interesting paper, Millenson and de Villiers (1972) 
suggest that this “failure to consider that the CER is 
an emotional phenomenon” has been a barrier to the 
adequate understanding of the effects we have been 
discussing. These writers seek to develop Skinner’s
(1938) statement that emotion is “a state of strength
comparable in many respects with a drive” (p. 407)
and to argue that conditioned suppression results
from “a negative drive activity,” a view similar to that
of Estes. Thus “when the signal for shock [is] pre-
sented the rat’s hunger might be temporarily sus-
pended and ‘suppression’ is the natural consequence
of food (as well as all other positive reinforcers) having
temporarily lost its reward value” (Millenson,
1971, p. 229).

A motivational decrement theory such as this di- 
rects research attention to questions somewhat differ- 
ent from those discussed so far. Clearly, a stimulus 
which precedes a shock of a given intensity will have 
a greater suppressive effect, according to this theory, 
if it is superimposed on behavior which is relatively
weakly motivated. In this context, Millenson and de
Villiers (1972) discuss experiments in which they
varied the deprivation conditions for subjects ex-
posed to a preshock stimulus. The results for one of
these are shown in Figure 8. Groups of rats were exposed
to a random-interval 60-sec schedule of food
reinforcement in two conditions on each day: first 
when 9 hr food-deprived (prefeeding) and then after 
being given 8-15 g of free food (postfeeding). In both 
conditions, a stimulus of variable length (Millenson & 
Hendry, 1967) ended with an unavoidable shock. The 
panel on the left in Figure 8 shows that mean rates of 
responding in the safe periods (i.e., in the absence of 
the preshock stimulus) were consistently higher in the 
prefeeding condition than in the postfeeding condi-
tion. The preshock stimulus suppressed both these
patterns of behavior, the effect being dependent on 
the intensity of the shock delivered in the various 
phases of the experiment. The absolute decrease in 
response rate was greater in the prefeeding condition. 
However, the panel on the right in Figure 8 shows 
that in terms of a suppression ratio (CS/Control rate),
the postfeeding condition appears to show the greater
relative suppression at all shock intensities, the effect
being clearer at .2 and .1 mA, where it is less con-
taminated by a ceiling effect produced by severe
disruption of behavior. The data of this study are pre-
sented in terms of both absolute and relative response
rates, and it can therefore be seen that the lower 
control response rates (postfeeding) are the more sup- 
pressed in terms of suppression ratios. Since, with pac- 
ing procedures in which the deprivation conditions 
are held constant, lower rates of responding are the 
less disrupted (Blackman, 1968b), it seems reasonable 
to conclude with Millenson and de Villiers that the 
suppressive effects of their preshock stimulus are 
related directly to the value of the reinforcers. Thus
“emotion” has a greater disruptive effect on behavior 
which is less strongly motivated. 
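
The contrast between absolute and relative measures of suppression can be made concrete with a small sketch. It assumes the suppression ratio used in this passage (response rate during the CS divided by the control, or "safe", rate), so that smaller ratios indicate greater suppression. The response rates are invented for illustration and are not those of Millenson and de Villiers (1972).

```python
def suppression_ratio(cs_rate, control_rate):
    """CS rate divided by control rate; smaller values mean more suppression."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return cs_rate / control_rate


def absolute_decrease(cs_rate, control_rate):
    """Drop in response rate from control (safe) periods to the CS."""
    return control_rate - cs_rate


# Hypothetical response rates (responses per minute), for illustration only.
conditions = {
    "prefeeding": {"control": 30.0, "cs": 15.0},
    "postfeeding": {"control": 10.0, "cs": 3.0},
}

summary = {
    name: (
        absolute_decrease(rates["cs"], rates["control"]),
        suppression_ratio(rates["cs"], rates["control"]),
    )
    for name, rates in conditions.items()
}

for name, (decrease, ratio) in summary.items():
    print(f"{name}: absolute decrease = {decrease:.1f}, ratio = {ratio:.2f}")
```

In this invented case the prefeeding condition loses more responses per minute, yet the postfeeding condition has the smaller ratio and is therefore the more suppressed in relative terms. The two measures can thus rank the same pair of conditions in opposite orders.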


Fig. 8. The effects of a preshock 
stimulus on random-interval be- 
havior in rats. The rats were 
tested when 9 hr food-deprived 
(prefeeding) and shortly after 
8-15 g of free food (postfeed-
ing). On the left are shown re-
sponse rates during the CS and
in its absence (safe) in both 
conditions. On the right are 
shown the resulting suppression 
ratios. (From Millenson & de 
Villiers, 1972.) 

Millenson and de Villiers (1972) also reported an
interesting change in relative preference when a pre- 
shock stimulus is superimposed on behaviors which 
are maintained by a concurrent schedule of reinforce- 
ment. Rats were exposed to a situation in which they 
could always press one lever for occasional access to 
1.5-sec milk reinforcement or press another lever for 
access to the same reinforcer for 4.5 sec. In control 
conditions, an asymmetric performance was observed, 
rats showing some preference for the second lever. 
During a preshock stimulus which was superimposed 
on this concurrent schedule, the preference was en- 
hanced, there being more suppression (in relative 
terms) of the responding on the 1.5-sec lever than on 
the 4.5-sec lever. This is further support (it is argued) 
for a drive decrement theory of conditioned suppres- 
sion, since the increase in preference results from the 
greater suppressive effect of a conditioned emotional
response on the less motivated behavior. 
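
The arithmetic behind this argument can be sketched briefly. The sketch assumes that relative preference is expressed as the proportion of all responses made on the 4.5-sec lever, which is one common convention and not necessarily the measure used by Millenson and de Villiers. All response counts are hypothetical.

```python
def relative_preference(long_lever, short_lever):
    """Proportion of all responses made on the lever giving longer access."""
    total = long_lever + short_lever
    if total == 0:
        raise ValueError("no responses recorded")
    return long_lever / total


# Hypothetical response counts, for illustration only.
control = {"short": 40.0, "long": 60.0}    # no preshock stimulus
during_cs = {"short": 10.0, "long": 30.0}  # preshock stimulus present

# Suppression ratio (CS / control) for each lever separately.
ratio_short = during_cs["short"] / control["short"]
ratio_long = during_cs["long"] / control["long"]

pref_control = relative_preference(control["long"], control["short"])
pref_cs = relative_preference(during_cs["long"], during_cs["short"])

print(ratio_short, ratio_long)
print(pref_control, pref_cs)
```

Because responding on the 1.5-sec lever is suppressed proportionally more than responding on the 4.5-sec lever in this invented case, the proportion of responses on the 4.5-sec lever rises during the stimulus. Equal proportional suppression on both levers would leave the preference unchanged.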

Motivational theories suggest research which might 
not arise from other conceptual schemes. Empirical
data such as those of Millenson and de Villiers are 
therefore welcome and challenging. However, as with 
all such theories, there are potential disadvantages in 
the motivational view of conditioned suppression with 
its appeal to states which cannot be measured di- 
rectly. For example, the differential effects of a pre- 
shock stimulus on behavior maintained by various 
schedules (discussed in the section on operant con- 
ditioning parameters, above) may too easily be trans- 
lated into motivational terms in a way which can be 
difficult to refute. If a pattern of behavior proves to 
be susceptible to conditioned suppression, this can be 
taken as evidence that motivation is weak. On the 
other hand, behavior which is resistant to conditioned 
suppression can readily be described as strongly mo- 
tivated. Since behavior which is reinforced frequently 
is less disrupted by a preshock stimulus, motivation 
can be said to vary with reinforcement frequency in a 
way that seems acceptable. Similarly, animals can be 
described as weakly motivated in the postreinforce- 
ment pause on a fixed-ratio schedule, thus handling 
Lyon’s (1964) conditioned suppression data discussed 
earlier. But it has also been argued (Millenson & de 
Villiers, 1972) that because high rates of responding 
are relatively susceptible to conditioned suppression, 
they may be weakly motivated. This view may seem 
initially less plausible. It is true that Fantino (1968) 
has shown that animals prefer situations in which 
they are allowed to obtain reinforcement by respond- 
ing at unpaced rates to situations in which they are 
required to respond at high rates. This might appear 
to be the independent evidence of the strength of 
motivation generated by different schedules which is
clearly required to support the motivational theory of 
conditioned suppression. However, on this basis Fan- 
tino’s experiment does not suggest a reason why low 
rates of responding are even more resistant to condi- 
tioned suppression than unpaced moderate rates, since 
he found that the latter are preferred in choice situ- 
ations. 

The idea that conditioned suppression results from 
an underlying emotional state has proved attractive 
in psychopharmacology. It has been argued that this 
behavioral manifestation of anxiety or of a condi- 
tioned emotional response may prove useful in the 
analysis of drugs which are presumed to act spe-
cifically on such states. Hence the effects on condi-
tioned suppression have been reported of many drugs
such as “tranquilizers” and barbiturates which have 
been used in clinical practice in an attempt to allevi- 
ate anxiety states. These reports have recently been 
reviewed by Millenson and Leslie (1974), who point 
out the considerable advantages of the conditioned 
suppression procedure in this context. First, as is the 
case with most operant conditioning experiments, 
experimental sessions may continue for long periods, 
thereby allowing the time course of a drug’s effects to 
be measured. (See also Thompson & Boren, Chapter 18 
of this volume.) Second, by choosing the parameters of 
the conditioned suppression experiment judiciously, 
it is possible to establish partial suppression during a 
preshock stimulus, thereby providing a behavioral 
base line which is sensitive to either alleviating or en- 
hancing effects of a drug on anxiety. Third, and 
perhaps most important, since the procedure includes 
both signaled periods when anxiety is presumed to be 
suppressing behavior and periods of safety from aver-
sive stimuli, it is possible to provide a within-sessions
control for any side effects which a drug might have 
on overall motivation, on sensory function, or on loco- 
motor activity. 

An early experiment by Brady (1956) has been 
widely cited as illustrating the potential of this tech- 
nique. Brady, using rats as subjects, established partial 
conditioned suppression of intermittently reinforced 
operant behavior during a preshock stimulus and 
then investigated the effects of amphetamine and of 
reserpine. Brady claimed that both these drugs had 
specific effects “in the affective sphere,” i.e., on the
conditioned emotional response. Thus amphetamine
strengthened the CER, for in comparison with saline 
conditions the drug produced a decrease in the num- 
ber of responses emitted during the preshock stimu- 
lus, in spite of what Brady described as a nonspecific 
side effect on the behavioral base line taking the form 
of an overall increase in control response rates. Simi-
larly, it was argued that reserpine attenuated the
CER: despite a nonspecific decrease in overall re- 
sponse rate, the number of responses during the pre- 
shock stimulus was greater than on saline days. 

Unfortunately, subsequent work in this area has 
not consistently produced similarly encouraging data, 
and some signs of gloom have emerged as to the gen- 
eral usefulness of conditioned suppression as a model 
of anxiety in this context (e.g., Davis, 1968; Kelleher & 
Morse, 1964). Failures to produce clear-cut effects 
have led to interestingly different interpretations on 
occasion. Thus Kinnard, Aceto, and Buckley (1962)
were led to conclude that conditioned suppression is 
not a model of anxiety. On the other hand, Ray (1964) 
concluded from essentially similar results that it is a 
model of anxiety, and that therefore “tranquilizing” 
drugs do not have a specific effect on anxiety. 

It seems likely that the conditioned suppression
phenomenon is a simple model of anxiety only on a
superficial level. We have seen some of its complexities
in preceding sections, and these must surely com-
plicate the analysis of any drug’s effects. Thus Appel
(1963) has shown that a dosage of reserpine which reliably
reduced conditioned suppression when the shock
intensity was .6 mA failed to produce consistent effects 
when the shock was increased only to 1.0 mA. Simi- 
larly, the response rates during the preshock stimulus 
in Brady’s (1956) experiment may have been changed 
not only by any specific effect of the drugs on the CER
but also by the “nonspecific” side effects of the drugs: 
for example, the increased overall response rates pro- 
duced by amphetamine may themselves have pro- 
duced the relative decrease in response rates during 
the preshock stimulus (Blackman, 1972). Or again, 
since amphetamine is known to be an anorexic agent, 
the relative susceptibility of behavior to suppression 
after its administration may be the result of a rela- 
tively low motivational state in the subject on those 
days (Millenson & Leslie, 1974). Finally, any attenuat- 
ing or enhancing effects of a drug on the amount of 
conditioned suppression may merely be the outcome 
of a drug’s rate-dependent effects on the two rates of 
responding during and in the absence of the preshock 
stimulus (Wuttke & Kelleher, 1970).
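
One reason such baseline changes complicate interpretation can be shown with a short sketch. It assumes a suppression ratio of the form CS rate divided by control rate, and it uses invented response rates, not data from Brady (1956) or any other study cited here.

```python
def suppression_ratio(cs_rate, control_rate):
    """CS rate divided by control rate; smaller values mean more suppression."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return cs_rate / control_rate


# Hypothetical response rates (responses per minute), for illustration only.
saline = {"control": 20.0, "cs": 10.0}
drug = {"control": 40.0, "cs": 10.0}  # baseline doubled, CS rate unchanged

ratio_saline = suppression_ratio(saline["cs"], saline["control"])
ratio_drug = suppression_ratio(drug["cs"], drug["control"])

print(ratio_saline, ratio_drug)
```

Here the rate during the stimulus is identical under both conditions, yet the ratio falls from .5 to .25 simply because the control rate has risen. An apparent strengthening of suppression in relative terms need not, therefore, reflect any specific effect of a drug on responding during the preshock stimulus.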

Despite the complexities of the situation, Millen- 
son and Leslie (1974) have suggested that the effects 
of drugs on conditioned suppression have not been as 
inconsistent as has sometimes been supposed. They 
consider the reported effects of chronic and acute doses 
of various drugs separately and conclude that minor 
tranquilizers (benzodiazepines, barbiturates, and me-
probamate) have a relatively consistent effect in alleviating
suppression in acute doses; on the other hand,
it seems that phenothiazines and reserpine alleviate 
suppression fairly consistently in studies in which they 
are administered chronically. 

The final experiment to be considered in this short 
review of drug effects was reported by Miczek (1973) 
and suggests that an emotional substrate of condi- 
tioned suppression may indeed be specifically affected 
by some drugs. Miczek reports that chlordiazepoxide 
alleviates conditioned suppression of behavior main- 
tained by a VI schedule. His report presents data 
(shown on the right in Figure 9) in terms both of 
suppression ratios and of base line response rates fol- 
lowing various injections of the drug, and in this 
case it seems that the dose-related alleviation of con- 
ditioned suppression is not contaminated in any gross 
way by any changes in behavioral base lines. Even 
more important evidence, however, is to be found in 
the effects of the drug administered to other animals 
exposed to a slightly different situation. These rats 
were also trained on a VI schedule. In their case, 
however, a stimulus was superimposed which ended
not with a shock but with the delivery of free food.
Operant behavior was suppressed during this stimu-
lus in much the same way as occurs during a preshock
stimulus (this finding of “positive conditioned sup-
pression” is reviewed in the next section of this
chapter). Miczek reports, however, that this suppres-
sion of operant behavior was not attenuated by in-
jections of chlordiazepoxide (see the left panels in
Figure 9). These results suggest the drug has a specific
effect on anxiety, rather than simply exerting differ-
ent effects on different rates of responding regardless
of the nature of the US signaled by the CS.

Fig. 9. The effects of chlordiazepoxide on suppression during a
stimulus which precedes food (left) or shock (right). Suppression
ratios are shown at the top and base line response rates below.
Notice that the schedules of dosages are not identical in the
two conditions. (Redrawn from Miczek, 1973.)

Studies of drug effects on conditioned suppression
usually attempt to identify specific effects on the emo- 
tional states which are thought to produce the sup- 
pression. It may be recognized that these rather vaguely 
defined emotional states may be regarded as condi-
tioned responses and might therefore have been dis-
cussed in the context of the interference hypothesis:
i.e., emotion (anxiety) is a classically conditioned re-
sponse which disrupts ongoing operant behavior. How-
ever, most drug studies do not attempt to identify the
interfering conditioned emotional response per se,
and that is why they have here been discussed in the 
context of a motivational theory. 


The Punishment Hypothesis 


In the conditioned suppression procedure a stimu- 
lus is superimposed on ongoing behavior and ends 
with a shock which is delivered regardless of what the 
subject does. Punishment, however, is defined as the 
“reduction of the future probability of a specific re-
sponse as a result of the immediate delivery of a stim-
ulus” (such as a shock) after that response (Azrin &
Holz, 1966, emphasis added). It has been suggested 
that there are no fundamental differences between the 
processes that lead to these two forms of suppression. 
On the one hand, it has been argued (e.g., by Estes, 
1944) that responding which is explicitly followed by 
shock is suppressed by the process outlined above in 
the motivational theory of conditioned suppression. 
Thus the shock is associated with certain external 
cues: these become conditioned stimuli by a Pavlovian 
procedure, and so behavior is suppressed as a result 
of a conditioned emotional response. This account 
has few advocates today as a theory of punishment 
(Azrin & Holz, 1966). However, the opposite theory 
has also been presented—that conditioned suppression 
results from an occasional chance contiguity between 
operant behavior and the delivery of a shock. This 
theory has been discussed at some length by Lyon 
(1968), and since there have been relatively few recent 
experiments explicitly designed to test it, will be dealt
with only briefly here. 

Clearly, shocks delivered “independently” of be- 
havior may occasionally be associated with behavior 
in this way. Gottwald (1967) has shown that the 
amount of conditioned suppression on any trial is 
affected by the proximity of shock to a response on the 
previous trial. However, there are a number of rea- 
sons to doubt that conditioned suppression is princi- 
pally caused by adventitious punishment. One of the 
most important of these is represented in Rescorla’s 
work reviewed earlier. In his experiments, the associa- 
tion of a stimulus with shock is accomplished “off the 
base line,” i.e., before operant training is begun. Sub- 
sequently only the conditioned stimulus is superim- 
posed on operant behavior, and there is therefore 
no opportunity for any adventitious contiguity be-
tween shock and response. Yet, of course, conditioned
suppression does occur during the CS in these experi-
ments. Hoffman and Barrett (1971), using observa-
tional techniques and initial association of a stimulus
with shock “off the base line,” have also failed to support
the punishment hypothesis with pigeon subjects,
since again conditioned suppression developed when
possible contiguity between shock and responding was
minimized. There is also a good deal of evidence that
the development of conditioned suppression may be 
accompanied at a gross level by more signs of auto- 
nomic disturbance than is punished behavior (Hunt & 
Brady, 1955). In addition, Annau and Kamin (1961) 
have claimed that a shock of .28 mA was sufficient in 
their experiment to suppress behavior when used in a 
punishment procedure, but not when used in the 
conditioned suppression procedure. Orme-Johnson 
and Yarczower (1974) used a yoking procedure, in 
which pigeons were exposed either to a discriminated 
punishment procedure or to one which delivered the 
same number of shocks independently of behavior. 
They reported that the latter procedure produced 
suppression while the former produced none. More- 
over, the stimulus associated with shock in the con- 
ditioned suppression procedure acquired conditioned 
punishing effects, while the discriminative stimulus 
for the punishment procedure did not. 

Lyon (1968) has argued that “punishment and con- 
ditioned suppression do not represent a behavioral 
dichotomy but specific points on a behavioral con- 
tinuum,” a suggestion that it is difficult to refute 
unequivocally. Differences between the effects of the 
two procedures may always be interpreted in such a 
way. However, considerable procedural differences be- 
tween punishment and conditioned suppression in- 
evitably present difficulties in comparing them, not 
least because suppression of responding has a conse-
quence in reducing shock frequency in the former
case but not the latter. It therefore seems unrewarding 
to try to reduce either one to the other, and this is 
perhaps why there is little current research with this 
emphasis. 


A BRIEF REVIEW OF SOME OTHER 
CLASSICAL-OPERANT INTERACTIONS 


The problems and questions arising from the study 
of interactions between classical and operant condi- 
tioning have been illustrated so far exclusively by 
studies in which stimuli associated with shock have 
been superimposed on operant behavior maintained
by food or water reinforcement. We now turn to con-
sider briefly some other procedures.

For some time, motivational theories of condi-
tioned suppression gained considerable support from
their apparent ability to handle data describing the 
effects of signaling an unavoidable shock when an 
animal’s operant behavior is maintained by a schedule 
of shock avoidance. For example, Rescorla and Solo- 
mon (1967) have argued that the laws of Pavlovian 
conditioning are “the laws of emotional conditioning 
or laws of acquired drive states” and that “condi-
tioned emotional states change [the subject’s] motiva-
tion level and thus can serve either as motivators or
reinforcers of instrumental responses.” They there- 
fore make the general assertion that aversively moti- 
vated operant behavior will increase in frequency dur- 
ing a stimulus which precedes an unavoidable shock, 
since this conditioned emotional state will summate 
with the motivation maintaining avoidance behavior. 
Studies by Sidman, Herrnstein, and Conrad (1957), 
Kelleher, Riddle, and Cook (1963), and Waller and 
Waller (1963) all showed that free operant avoidance 
did increase in frequency in this way during a stim- 
ulus which preceded the delivery of an unavoidable 
shock. However, more recently there have been a num-
ber of reports of conditioned suppression even of
avoidance behavior (e.g., Blackman, 1970b; Bryant, 
1972; Hurwitz & Roberts, 1969; Pomerleau, 1970; 
Roberts & Hurwitz, 1970; Scobie, 1972). It seems that 
suppression of avoidance behavior may occur if the 
unavoidable shock is discriminable from avoidable 
shocks (e.g., of a different intensity), or if such sup- 
pression does not increase the overall frequency of 
shocks, either because the avoidance schedule is sus- 
pended during the warning signal or because the re- 
sponse-shock times of the schedule are relatively long 
in comparison with the duration of the signal. At
present, it is not easy to predict precisely the circumstances
in which suppression will be the rule, but it is
difficult to fit examples such as these into any tradi- 
tional motivational theory. On the other hand, an 
interference hypothesis should in principle be as ca- 
pable of handling suppression of avoidance behavior as 
of coping with suppression of positively motivated be- 
havior. The problem with this theory, however, is that 
it offers little in the explanation of any acceleration of 
avoidance behavior during a conditioned stimulus. 

Rescorla and Solomon (1967) also predicted on the 
basis of their motivational theory that, in their terms, 
any appetitively motivated behavioral base line will 
increase in frequency during an appetitive Pavlovian 
conditioned stimulus. In other words, food-reinforced 
operant behavior should increase in frequency during 
a signal that ends with presentation of free food. We 
have seen already, however (Miczek, 1973), that conditioned
suppression may occur during such a stimulus.
For example, using rats as subjects, Azrin and Hake
(1969), Meltzer and Brahlek (1970), and Hake and
Powell (1970) have all reported suppression of responding
during a stimulus lasting 10 or 12 sec and
ending with the presentation of free food. Similarly,
suppression has been reported with monkeys during 
prefood stimuli (Kelly, 1973a, 1973b; Miczek & Gross- 
man, 1971). It is difficult to see how the Rescorla and 
Solomon (1967) account of classical-operant interac- 
tions can handle such findings. It is intriguing, how- 
ever, to see the vigor with which other theoretical 
accounts of the more traditional conditioned suppres- 
sion during a preshock stimulus have been extended 
to this so-called positive conditioned suppression. In 
their study, Azrin and Hake (1969) used either food or 
water as the reinforcer for operant behavior and de- 
livered “free” food, water, or intracranial stimulation 
at the end of their stimulus. In general, they found 
suppression during the stimulus with all the combina- 
tions of reinforcer and US which they tested. They 
suggested that such suppression was the result of a 
“general emotional state” and argued that suppression 
during a preshock stimulus is another example of the 
effects of such a state, rather than a model of a specific 
anxiety state. Azrin and Hake do not specify the 
nature of this general emotional state, but it would 
seem to be basically similar to Skinner’s concept of 
emotion (1938, p. 407). 

Kelly (1973a) has attempted to monitor any 
changes in autonomic activity in monkeys which 
might be conditioned during a prefood stimulus, with 
a view to evaluating the interfering respondents 
hypothesis in the context of positive conditioned suppression.
Using the same experimental techniques to
monitor cardiovascular activity as Brady, Kelly, and
Plumlee (1969) had employed in their study of pre- 
shock stimuli, Kelly was again unable to detect any 
systematic covariation of changes in autonomic activ- 
ity and operant suppression. He therefore dismisses 
the idea that positive conditioned suppression is 
caused by interfering respondents produced by the 
Pavlovian aspects of the procedure. 

One difficulty in considering a theory of positive 
conditioned suppression in terms of interfering re- 
spondents is that the status of the free food as a Pav- 
lovian unconditioned stimulus is by no means clear. 
In all the studies in this area, except when brain 
stimulation ended a stimulus in Azrin and Hake’s 
(1969) study, the delivery of the “free” event seems to
act more as a discriminative stimulus setting the occa- 
sion for an approach response to a particular part of 
the experimental chamber than as a stimulus which 
unconditionally elicits some response. This observa- 
tion serves to emphasize the possibility that positive 
conditioned suppression might be produced by inter- 
fering operants, a recurring theme in this research 
(Farthing, 1971). Thus it may be that recorded oper- 
ant responding decreases because the subject makes 
preparatory approaches to the food cup which maxi- 
mize the speed of taking up the free food when it 
is delivered, although most reports in this area claim 
that such behaviors could not be detected. Also, 
whether suppression or acceleration of responding 
develops during a prefood stimulus, the possibility 
must be considered that this is superstitiously rein- 
forced by the delivery of the free food—an analogue of 
the punishment hypothesis of the effects of preshock 
stimuli. However, the evidence for superstitious rein- 
forcement in this context is not strong (see Staddon, 
1972). 

The effects of prefood stimuli are being shown in- 
creasingly to depend on the parameters of the situa- 
tion and on the nature of the behavior on which they 
are superimposed. Thus Meltzer and Brahlek (1970) 
reported acceleration of rats’ variable-interval be- 
havior during a 120-sec prefood stimulus, but, as 
noted, suppression during a 12-sec stimulus. Henton 
and Brady (1970) trained monkeys on a DRL 30-sec 
schedule and found no effect of a prefood stimulus of 
20 or 40 sec, but they found acceleration during a pre- 
food stimulus lasting 80 sec. Kelly (1973b) also found 
acceleration of monkeys’ DRL behavior during a 60- 
sec prefood stimulus; his experiment, however, also 
made it possible to compare this effect with that of 
the same prefood stimulus on random ratio behavior. 
This revealed a schedule-dependent effect, for the 
latter behavior was suppressed during the stimulus.
Smith (1974), using pigeons, investigated the contribution
of various response rates and reinforcement frequencies
to the effects of prefood stimuli of various
lengths. He found that both high and low response 
rates were increased during a 5-sec prefood stimulus. 
With longer stimuli, high rates were suppressed, but 
low response rates were unaffected. With two of the 
three subjects, high response rates were less suppressed 
when they obtained high frequencies of reinforcement 
rather than lower frequencies. It is clear from this 
study that there are considerable similarities in the 
variables which control the amount of suppression 
during a prefood stimulus, as here, and during a pre- 
shock stimulus (e.g., Blackman, 1968b). 

There appears to be an important species-dependent
effect when relatively short prefood stimuli are
used in experiments. Although the above review sug- 
gests that the behavior of rats and monkeys is con- 
sistently suppressed in such conditions, LoLordo 
(1971) has found that pigeons’ response rates increase. 
Similarly, Smith (1974) found increases in various be- 
havioral base lines during a 5-sec prefood stimulus 
with his pigeons. In a recent study, LoLordo, McMil- 
lan, and Riley (1974) have thrown considerable light 
on this anomaly. They found that if the operant re- 
sponse being studied was key pecking, response rates 
increased if the prefood stimulus was a change in the 
illumination of the key. However, if the prefood stim- 
ulus was a nonlocalized tone, there was no accelera- 
tion. Similarly, there were no consistent effects of a 
tone or light prefood stimulus if the operant was 
treadle pressing rather than key pecking. The authors 
interpret these results as suggesting that the accelera- 
tive effect dependent on a localized prefood stimulus 
is an example of an autoshaped and automaintained 
response (Brown & Jenkins, 1968). This suggestion has
the immediate effect of bringing the discussion toward 
the work of Gamzu and Schwartz (1973), who have 
developed the view that key-pecking rates of pigeons 
may depend on a summation of pecking maintained 
by instrumental contingencies per se and pecking 
which is supported by automaintenance. Since auto- 
shaping and automaintenance have been discussed in 
the context of classical rather than operant condition- 
ing (Jenkins & Moore, 1973), the work of Gamzu and 
Schwartz (1973) and its extension to phenomena such 
as behavioral contrast (e.g., Keller, 1974) is clearly 
relevant to the study of interactions between classical 
and operant conditioning. However, since they are 
discussed elsewhere (Chapter 3), these ideas are not 
developed here. 

It can be seen then that there has been much recent 
work on the effects of prefood stimuli on operant behavior.
Some studies have even reported that such
stimuli may have suppressive effects on behavior 
maintained by an avoidance schedule (e.g., Davis & 
Kreuter, 1972; Henton, 1972). In general terms, 
studies of the effects of prefood stimuli have developed 
in a similar way to those of preshock stimuli. In both 
cases, the parameters of the procedure and the nature 
of the behavioral base lines on which it is superim- 
posed are crucial, and this makes it impossible to 
make general assertions that a given preevent stimulus 
will have simply suppressive or enhancing effects on 
behavior. There is at present no adequate general 
theory, whether this be couched in terms of a general 
emotional state, conditioned drives, competing respondents
or operants, or superstitious reinforcement
of different rates by the delivery of free food. In short,
research in this area may be said to mirror almost
exactly the problems which have been discussed in
the context of preshock stimuli throughout this
chapter.


CONCLUSION 


The procedures we have considered in this chapter 
have an apparent simplicity that can obscure the very
real complexities both of measurement and of inter- 
pretation. In particular, the appropriate measurement 
of the disruption of operant behavior by classical con- 
ditioning procedures poses great problems. There is a 
real danger that describing these effects in terms of a 
relative rate during the conditioned stimulus can ob- 
scure important aspects of the situation. In spite of 
this, we have seen that the measurement of Pavlovian 
conditioning processes through what is usually re- 
garded as their indirect effects on operant behavior 
has been widely recognized as being unusually sensi- 
tive and thereby productive. On the other hand, at- 
tempts to monitor any autonomic effects which might 
be supposed to be directly conditioned by the Pavlovian
aspects of the procedure have been generally
disappointing: autonomic changes often do occur dur- 
ing the conditioned stimulus, but they in no sense 
appear to reflect the orderliness of the indirect oper- 
ant effects which one might suppose to be mediated by 
the classical conditioning of autonomic processes. 
Faced with this problem, it has been argued that it is 
a rather ill-specified conditioned emotional response
(CER) which is the direct outcome of the Pavlovian 
aspects of the conditioning procedure. Some workers 
have preferred to describe this CER as conditioned 
anxiety, a term which has a degree of superficial 
validity in a situation in which a stimulus precedes an
unavoidable aversive event. However, more recently
the idea of a general emotional state has been revived, 
of which the traditional conditioned emotional re- 
sponse is said to be but one example. A further theory 
suggests that disruptive effects of a conditioned stim- 
ulus result from the conditioning of a motivational 
state which interacts with the motivation which sup- 
ports the base line operant behavior. Yet a further 
possibility is that disruption of operant behavior dur- 
ing a preevent stimulus is the outcome of poorly con- 
trolled instrumental contingencies and hence reflects 
the strength of other interfering operants or the re- 
sult of adventitious punishment or reinforcement. 
Whether disruptions of operant behavior are 
thought to reflect underlying classical or operant conditioning
effects or the development of changed motivational
states, it is quite clear that the effects of any
preevent stimulus depend critically on the nature of
the behavioral base lines on which they are superim- 
posed. The effects of classical conditioning procedures 
on operant behavior are therefore schedule-depen- 
dent, as are the effects of so many other independent 
variables. The differing degrees of susceptibility to 
disruption by a Pavlovian conditioned stimulus pose 
further questions: do these differences reflect different 
strengths of an underlying conditioned response, or is 
this strength determined solely by CS-US parameters
so that different degrees of suppression reflect the re- 
sistance to disruption of different patterns of operant 
behavior? Similar problems of interpretation arise 
from the effects of drugs on disruptions of operant 
behavior during a conditioned stimulus and clearly 
return us to the problem of appropriate measurement. 
In spite of the many problems of measurement and 
interpretation which have been discussed in this chap- 
ter, studies of the effects of classical conditioning pro- 
cedures on operant behavior have long played an 
honorable role in learning theory. A problem in re- 
viewing this research, however selectively, is that it 
has been related at various times to many theoretical 
controversies in psychology, and these general issues 
have been mentioned only briefly here. The procedure
has proved to be successful in providing a sensitive 
dependent variable for the study of the necessary and 
sufficient conditions for the development of an ac- 
quired reflex. However, research using this procedure 
has also provided empirical evidence which has been 
related to motivational theories of behavior and the
role of classical conditioning in motivation, to the 
study of emotion, to the relations between physiolog- 
ical events and overt behavior, to the study of the 
effects of potential anxiolytic agents, and to many 
other important problems. Indeed, perhaps one of the
most important features of these studies is that they
provide a focus for discussion between workers of 
different theoretical persuasions. In this light, it seems 
almost symbolic that the amount of conditioned sup- 
pression in a specified situation is a function of both 
classical and operant conditioning parameters, and 
that this disruption seems at present not to be a direct 
reflection of underlying physiological or autonomic 
processes. This complexity emphasizes that no one ap- 
proach to the problems discussed here can be thought 
of as dominant. Hearst and Jenkins (1975) have 
recently suggested that identifying the different forms 
of learned behavior which develop in specified circum- 
stances is at present preferable to the espousal of any 
universal theory of learning. The results reviewed in 
this chapter support this view.


13


Negative Reinforcement and Avoidance


INTRODUCTION 


This chapter deals with behavior that is maintained
when it removes, reduces, or prevents stimulation.
The stimulation is called aversive on the basis
of this functional relation with behavior. Through the 
same functional relation, the removal or reduction of 
stimulation is defined as negative reinforcement. As 
the title suggests, many of the experiments to be de- 
scribed here were initially designed and interpreted 
as avoidance, with this term taken both as a conve- 
nient category of procedures for shaping and maintain- 
ing behavior, and as a presumably valid category of 
behavior or behavioral processes. However, avoidance 
will not be my organizing principle. The everyday 
meaning of this term is too general to assist analysis. 
Defined more precisely as the prevention, rather than 
the reduction or removal of aversive stimulation, 
avoidance applies to only a few of the procedures and 
data to be included here. Further, within psychology
the term has been identified mainly with one theory,
and with experimental procedures oriented to that 
theory. The history of the interplay between avoid- 
ance theory and experiments has been thoroughly 
documented by Herrnstein (1969) and by Bolles (1973) 
and will not be recounted here. I will, however, sketch 
the theory, initially noting some reasons for departing 
from it. Later I will occasionally indicate how the 
present approach relates to it or differs from it. 

In brief, avoidance theory has required that some 
stimulus, called a warning stimulus or conditioned 
stimulus, be paired with primary aversive stimulation
such as electric shock. Through this pairing the warn- 
ing stimulus is said to become aversive. Then, an overt 
response that is allowed to prevent the shock can also 
produce an immediate effect, terminating the warning 
stimulus. In some versions of the theory, the warning 
stimulus is identified as a Pavlovian conditioned stim- 
ulus, producing a conditioned internal response; re- 
moval of the stimulus then terminates the conditioned 
response. In all versions, responding is said to be rein- 
forced by its immediate effects, and only incidentally 
to prevent the primary stimulation. Paradoxically this 
is to assert that avoidance (as prevention of absent 
aversive stimulation) is not a basic process; rather that
such prevention is always the byproduct of escape
from a present conditioned stimulus. 

The present account departs from traditional 
theory, partly because warning stimuli are sometimes 
not aversive as predicted by the theory. Also, warning 
stimuli have multiple properties that the theory does 
not predict. These facts will be documented below. 

There are more general reasons for taking a differ- 
ent approach. Avoidance theorizing has conformed to 
a prejudgement that the reduction of aversive stimula- 
tion must be immediately discriminable upon occur- 
rence of a response, if that reduction is to reinforce 
the response. Thus in the prototypical procedure 
sketched above, the warning stimulus is included to 
provide an immediately discriminable effect. When 
avoidance procedures have not provided explicit 
immediate consequences, their interpretations have 
focused on plausible surrogates for the warning stim- 
ulus. The surrogates have been drawn from overt be- 
havior, or covert behavior, or internal time-correlated 
stimuli, but have always been said to function in a 
manner like that described above for externally-sup- 
plied warning stimuli. Explanation of avoidance, 
then, has been mainly an explanation of how an 
animal bridges gaps in time between a response and 
the consequent non-occurrence, or reduced occurrence 
of the aversive stimulation. ‘This focus came prior to 
any appreciable examination of the range of situations 
in which negative reinforcement is effective, 

In defense of the strategy of beginning with 
theoretical notions of mechanism, it can be argued 
that the theory provides a means for summarizing and 
organizing data. Further, when carried out with pre- 
cision as in mathematical models, this approach car- 
ries with it a formal evaluation of explanatory as- 
sumptions. However, when applied to avoidance 
theory this increased rigor has been accompanied by 
restriction to small ranges of data, gathered with impoverished
sets of procedures (e.g., Hoffman, 1966;
Theios, 1971). The more common, verbal postulations 
of avoidance mechanisms deal with more data, but 
as Hoffman (1966) and Church (1973) have noted, 
these accounts have at best been informal and impre- 
cise. In both cases, the focus on questions of mecha- 
nism has tended to constrain the range of procedures 
that are studied. 

The present account is still concerned with orga- 
nizing principles. However, I will focus on external 
controlling variables. Traditional avoidance proce- 
dures will be included, as part of a continuum that in- 
cludes situations where behavior is maintained by 
immediate consequences, as well as situations where 
its apparent controlling consequences are remote in
time. I will begin by describing two experiments. One
illustrates the fact that procedures conventionally 
labeled and studied as avoidance are very limited 
representatives of what this term might include. The
second experiment suggests that negatively reinforced 
behavior may be more similar to its positively rein- 
forced counterpart than is customarily assumed. I will 
then consider the forms that negative reinforcement 
procedures have taken, first defined in discrete-trial 
procedures, and then extended to free-operant pro- 
cedures in which only the aversive stimuli are manip- 
ulated. Next, I will consider procedures in which cues 
are added, providing for stimulus control of negative 
reinforcement, and changing the way in which inter- 
mittent aversive stimulation affects behavior. Finally, 
I will deal with some issues related to the shaping of 
particular responses with negative reinforcement. 

The following are some major points: The funda- 
mental operations in negative reinforcement proce- 
dures have been shock-delay and shock-deletion. In 
shock-delay, the timing schedules for shock are reset
by responses. In shock-deletion, the timing schedules
proceed independently, but responses can cancel shock
deliveries. In both of these, shock-frequency reduction 
appears to be a major controlling variable. When 
additional stimuli are provided, their functions de- 
pend on which procedural features they are correlated 
with. For example, added cues can control behavior 
through correlations with contingencies of reinforcement,
irrespective of the presence or absence of shock.
Added cues can also control behavior through their 
correlation with differing rates of shock. A distinction 
can be made between reinforcement due to a change 
of situation, and reinforcement due to shock-frequency 
reduction within a situation. Added stimuli can also 
be used to isolate particular variables such as shock- 
frequency reduction and short-term delay of shock. 
For the initial shaping of behavior with negative rein- 
forcement, one may encounter apparent constraints 
on conditioning. However, this “characterization by 
deficit” is of little help. Additional principles may be
needed to describe the maintenance of the ongoing 
stream of behavior upon which negative reinforce- 
ment must operate. 
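
The difference between the two operations named above can be made concrete with a small simulation. The following Python is only a minimal sketch under stated assumptions, not a description of any apparatus used in the studies cited. The function names, the representation of responses as a list of times in seconds, the rule that a shock wins a tie with a simultaneous response, and the rule that a response cancels only the next scheduled shock are all assumptions introduced for illustration. The SS and RS intervals follow the usage in Sidman-type procedures.

```python
def shock_delay(responses, ss_interval, rs_interval, session):
    """Shock times when each response resets (postpones) the shock timer.

    Assumed rules: with no responding, shocks recur every ss_interval sec;
    each response reschedules the next shock to rs_interval sec after it.
    Both intervals must be positive.
    """
    shocks = []
    pending = sorted(responses)
    i = 0
    next_shock = ss_interval  # first shock due one SS interval after start
    while True:
        # Every response made before the pending shock resets the timer.
        while i < len(pending) and pending[i] < next_shock:
            next_shock = pending[i] + rs_interval
            i += 1
        if next_shock > session:
            break
        shocks.append(next_shock)
        next_shock += ss_interval
    return shocks


def shock_deletion(responses, period, session):
    """Shock times when the timer runs independently of behavior.

    Assumed rule: a shock is scheduled at the end of every period, and it
    is cancelled if at least one response occurred during that period.
    """
    shocks = []
    start = 0
    end = period
    while end <= session:
        if not any(start <= r < end for r in responses):
            shocks.append(end)
        start = end
        end += period
    return shocks


if __name__ == "__main__":
    print(shock_delay([4, 12], ss_interval=5, rs_interval=10, session=30))
    print(shock_deletion([4, 12], period=5, session=20))
```

In this sketch the same two responses postpone every shock until late in the session under shock-delay, whereas under shock-deletion they remove only the two shocks scheduled for the periods in which they occurred. In both cases the overall shock frequency is lower than it would be without responding.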


TWO ILLUSTRATIVE EXPERIMENTS 


Broadening the Range of Avoidance Experiments 


Fig. 1. Schema for a procedure using aversively maintained multiple
operants. With no responding, the sequence progresses
as indicated at the top of the diagram, with shocks accompanying
the three-sec red light. As indicated, ratio schedules in effect
during the red and green conditions permit shock-deletion and
blackout for greater portions of the fixed 90-sec cycle. (After
Krasnegor, Brady, & Findley, 1971. © 1971 by the Society for the
Experimental Analysis of Behavior, Inc.)


Consider first an experiment that suggests a way to
systematically generate and examine a rich variety
of behavioral situations that might be called avoidance.
The experiment, by Krasnegor, Brady, and Findley
(1971), used Rhesus monkeys, and was based on an
abortable sequence of events diagrammed in Figure 1. 
Initial training was accomplished in phases, with the 
final procedure involving three stimuli in sequence on 
a recurring 90-sec cycle. If the confined monkey did 
not press a lever, a blue light was present for 30 sec, 
followed by a green light for 30 sec, followed by a 3- 
sec red light that was accompanied by three, brief, in- 
escapable shocks, followed by 27 sec of darkness, or 
timeout, until the beginning of the next cycle. Thirty 
responses in the presence of either the green light or 
the blue light, with the count starting from zero at 
each stimulus change, produced timeout for the re- 
mainder of the 90-sec cycle. If fewer than 30 responses 
occurred in both the blue and green 30-sec periods, 
the red 3-sec light appeared with its 3 inescapable 
shocks. Thus, sufficiently rapid and persistent responding
during either the blue or green light could abort
the sequence, producing timeout for the remainder of 
the current 90-sec cycle. Stable performances were ob- 
tained in which the subjects received few shocks. One 
monkey responded primarily in the presence of the 
green light, which was the situation proximal to the 
red light and shocks. The other monkey completed 
the ratios about equally often in the proximal and 
distal (green and blue) situations. When the size of 
either response requirement was varied while the 
other was held constant, the performances of both 
monkeys gave similar functional relationships. As 
shown by a comparison of the two parts of Figure 2, 
responding in the distal situation dropped off more 
quickly as a function of fixed ratio size in that situa- 
tion (Figure 2B) than it did in the proximal situation 
when that ratio was increased (Figure 2A). The main 
effect of varying the response requirements in the 
proximal or distal situations was to transfer the behavioral
output toward the one requiring fewer responses,
with a bias toward responding in the situation
closer to shock.
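
The abortable 90-sec cycle described above can be written out as a short program. This Python sketch is illustrative only: the function name, the dictionary it returns, and the treatment of responses made during the red light or the darkness as having no scheduled effect are assumptions, while the durations and the default ratio of 30 are taken from the description of the procedure.

```python
def run_cycle(response_times, fr_blue=30, fr_green=30):
    """Simulate one 90-sec cycle given response times (sec from cycle onset).

    Blue light: 0-30 sec; green light: 30-60 sec; red light with three
    shocks: 60-63 sec; darkness (timeout) for the rest of the cycle.
    The response count starts from zero at each stimulus change.
    """
    counts = {"blue": 0, "green": 0}
    required = {"blue": fr_blue, "green": fr_green}
    for t in sorted(response_times):
        if 0 <= t < 30:
            stimulus = "blue"
        elif 30 <= t < 60:
            stimulus = "green"
        else:
            continue  # assumed: responses in red or darkness do nothing
        counts[stimulus] += 1
        if counts[stimulus] >= required[stimulus]:
            # Ratio completed: timeout begins now and the shocks are deleted.
            return {"timeout_onset": t, "shocks": 0, "completed_in": stimulus}
    # Neither ratio completed: red light with its three shocks, then darkness.
    return {"timeout_onset": 63, "shocks": 3, "completed_in": None}


if __name__ == "__main__":
    no_responding = run_cycle([])
    # Twenty responses in blue (too few), then thirty responses in green.
    late_responder = run_cycle(list(range(1, 21)) +
                               [31 + 0.5 * i for i in range(30)])
    print(no_responding)
    print(late_responder)
```

Varying `fr_blue` while holding `fr_green` constant, or the reverse, corresponds to the manipulation whose results are summarized in Figure 2.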

This experiment indicates a promising approach 
for investigators who wish to make “avoidance” a 
primary concern. Krasnegor, Brady, and Findley’s 
procedure could be expanded or extended, not only to 
examine a wider range of response requirements, but 
also to examine “reversing chains” in which respond- 
ing in a proximal situation would reinstate a more 
distal situation. For example, the procedure described 
above could be modified so that responding in green 
would put the subject back in the presence of the blue 
light. I shall describe experiments with this feature 
later, in the context of stimulus control. One might 
also study choice and preference within this type of 
procedure by use of branching reverse sequences, such 
that differing, more-or-less distal situations could be 
made contingent upon different responses. This fol- 
lows a strategy similar to that outlined by Findley 
(1962) for the study of appetitively-maintained be- 
havior. The resulting procedures would more closely 
resemble what is commonly called “avoidance” than 
have traditional avoidance procedures. 


[Figure 2 axis label: Fixed Ratio Requirement]

Fig. 2. Part A shows results of manipulating the fixed-ratio
requirement in the proximal (green) stimulus while the ratio
size in the distal (blue) stimulus remained constant at FR 30.
Each data point represents the mean number of ratio completions
for the last five sessions at each value of the ratio in green.
Part B shows results of manipulating the FR requirement in
the distal (blue) stimulus while the ratio requirement in the
proximal (green) stimulus remained constant at FR 30. Each
data point represents the mean number of ratio completions
for the last five sessions at each value of the ratio in blue. The
differing data points (filled squares vs. open circles) denote
different monkeys. (After Krasnegor, Brady, & Findley, 1971. ©
1971 by the Society for the Experimental Analysis of Behavior,
Inc.)


Comparing Negative with Positive Reinforcement


My second illustrative experiment is by Kelleher
and Morse (1964), who compared responding maintained
by food presentation with responding maintained
by termination of situations, denoted by distinctive
visual cues, in which intermittent shocks
occurred. The latter consequence can also be characterized
as production of a discriminable shock-free
situation. Three squirrel monkeys were trained on
conventional multiple schedules of positive reinforce- 
ment in which a 10-min fixed-interval (FI) schedule 
alternated with a fixed-ratio (FR) schedule that required
30 responses. The schedules were accompanied
by white and red lights, respectively, and exposures to 
these schedules were separated by timeout periods 
denoted by a visual pattern of horizontal bars. Three
similar monkeys were trained on an analogous pair of 
schedules of negative reinforcement: in the FI sched- 
ule, brief shocks were scheduled to occur once per sec, 
beginning when the white light had been present for 
10 min; the first response after the 10-min period
produced the pattern of horizontal bars which de- 
noted a situation with no shocks, and no response 
contingency. In the presence of the red light, brief 
shocks were scheduled to occur once every 30 sec; the 
30th response in the presence of this light terminated 
the light and also produced the timeout stimulus. As 
shown in Figure 3, patterns of key-pressing main- 
tained by these alternating FI and FR schedules were 
nearly identical for the appetitively-maintained and 
the aversively-maintained procedures. With this accomplished,
Kelleher and Morse administered D-amphetamine
and chlorpromazine, separately and
with systematic variation of doses.

Fig. 3. Performance maintained by reinforcement with food
(upper records) compared to performance maintained with
electric shock (lower records) under multiple FI FR schedules.
The large excursions were produced when the 10-min FI
schedules were in effect; the smaller excursions were produced
on the 30-response fixed-ratio schedules. At reinforcement,
either through food delivery or shock prevention, the cumulative
recording pen reset to the bottom of the record. (From
Kelleher & Morse, 1964.)

A detailed description
of the drugs’ effects is given in Chapter 7 of this
book. For the present purpose, these effects can be 
easily summarized: While the two drugs had differing 
effects on behavior, the shock-maintained and food- 
maintained responding were very similarly affected by 
a given drug. In these situations the kind of conse- 
quences maintaining the behavior were less important 
than the specific pattern of behavior. This result 
recommends that appetitive/aversive distinctions not 
be taken for granted, nor too much predicated on 
them. Negatively-reinforced behavior and behavior 
characterized as avoidance may have much in common 
with behavior not so categorized. Nevertheless, the 
procedures for negative reinforcement differ formally 
from those of positive reinforcement, and the effects 
of negative reinforcement procedures will be the main 
concern here. 
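
The two negative-reinforcement components used by Kelleher and Morse can be sketched in the same way. The Python below is a hedged illustration rather than a reconstruction of their apparatus. The function names and return values are hypothetical, and three timing details are assumptions: that the first FI shock falls exactly at 10 min, that FR shocks fall at 30, 60, ... sec after light onset, and that a shock coinciding exactly with the terminating response is not delivered.

```python
import math


def fi_component(response_times, interval=600.0, shock_period=1.0):
    """FI component: shocks start at `interval` and recur every shock_period.

    The first response at or after `interval` ends the component.
    Returns (termination_time, shocks_received), or (None, None) if no
    response ever meets the requirement.
    """
    for t in sorted(response_times):
        if t >= interval:
            shocks = math.ceil((t - interval) / shock_period)
            return t, shocks
    return None, None


def fr_component(response_times, ratio=30, shock_period=30.0):
    """FR component: a shock every shock_period sec until `ratio` responses.

    Returns (termination_time, shocks_received), or (None, None) if fewer
    than `ratio` responses were made.
    """
    times = sorted(response_times)
    if len(times) < ratio:
        return None, None
    end = times[ratio - 1]
    shocks = max(0, math.ceil(end / shock_period) - 1)
    return end, shocks


if __name__ == "__main__":
    # Early responses are ineffective under FI; the one at 602.5 sec ends it.
    print(fi_component([100, 300, 602.5]))
    # One response every 1.5 sec completes the ratio at 45 sec.
    print(fr_component([1.5 * (i + 1) for i in range(30)]))
```

In both functions the subject's behavior determines only when the component ends. The shocks themselves are scheduled by the clock, which is what makes these schedules formal analogues of the food-maintained FI and FR schedules.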


NEGATIVE REINFORCEMENT 
WITHOUT ADDED CUES 


The Escape Procedure 


A simple procedure for negative reinforcement is 
that of escape. In this procedure an aversive stimulus 
is presented and some aspect of the subject’s behavior, 
which the experimenter specifies as a response, can 
terminate that stimulus. The stimulus is identified as 
aversive if the result of its termination is an increased 
probability or decreased latency of the response when 
that stimulus is again present. In most experiments on 
negative reinforcement the aversive stimulus has been 
electric shock. Other stimuli, such as intense light 
(Keller, 1941), loud noise (Harrison & Tracy, 1955), 
rotation (Riccio & Thach, 1966), temperature change 
(Weiss & Laties, 1961), and centrifugal force (Clark, 
Lange, & Belleville, 1973) have been used with varying 
success in aversive conditioning procedures, but the 
present chapter will deal mostly with shock since it 
has been almost universally used in systematic work. 

In the escape procedure, behavior in the absence of 
an aversive stimulus has no effect on its subsequent 
recurrence. Such responding is typically ignored. 
Thus, escape conditioning is aptly described as a dis- 
crete-trial procedure, where the presence of the 
aversive stimulus defines a trial during which the 
subject’s behavior is under study. Yet, while a trial is
easily identified, it is not a simple event. Three of its
components will be distinguished here; as shown in
Part I of Figure 4, they coincide in time on the prototypical
escape procedure, but can be viewed as independent.

[Figure 4 panel titles: II. Shock-Delay with Continuous Shock in Absence of Responding; III. Shock-Delay with Brief Periodic Shocks in Absence of Responding]

Fig. 4. Diagram illustrating three features as they apply to each
of three procedures, with and without responding. Time is
indicated linearly from left to right; upward displacement of a
line indicates when a given feature is in effect. An “R” indicates
occurrence of a response. Part I illustrates the basic escape
conditioning procedure. Part II illustrates a shock-delay procedure
in which failure to respond results in continuous shock.
Part III illustrates a more typical shock-delay procedure of the
kind devised by Sidman (1953), in which the RS interval is twice
the SS interval.

In the discussion that follows, these features
will provide a basis for relating the escape procedure
to other negative reinforcement procedures and for 
interrelating those procedures. The escape trial in- 
cludes: a) An aversive situation. In the basic escape 
procedure the aversive situation is defined by continuous
presence of the aversive stimulus. b) An opportunity
for responses to occur and to be counted. In
the basic escape procedure the opportunity is often 
controlled by closing a door or removing a response 
lever; typically the opportunity for responding is 
terminated with the first response of the trial. c) An 
occasion when the specified response can affect the 
occurrence of aversive stimulation. One can relate 
these features to those of positive reinforcement situa- 
tions: a) Providing an aversive situation is a manip- 
ulation that potentiates reinforcement. This feature 
identifies it as a “drive operation,” analogous to
deprivation procedures that are often used to poten- 
tiate positive reinforcement. Of course its discrimina- 
tive properties differ from those of a deprivation 
procedure; I will discuss these properties later. b) The
opportunity for responses to occur and be counted is
typically terminated with the occurrence of one re- 
sponse. Given that the opportunity coincides with c), 
the occasion during which reinforcement is “set up” 
or available, this limited opportunity provides for 
continuous reinforcement (crf), or reinforcement on 
FR 1. To the extent that in positive reinforcement 
situations the observed response is precluded during
positive reinforcement (key-pecking is improbable
when the grain hopper is accessible), typical negative
reinforcement and positive reinforcement situations
are similar. They differ in that positive reinforcement
durations are usually shorter than negative reinforce- 
ment durations. 


Free Operants and the Escape Paradigm 


Although the escape procedure of textbooks is 
usually described as requiring only a single response 
to terminate the shock and to produce an intertrial 
interval, where neither a), b), nor c) is operative, Dinsmoor
and his colleagues have demonstrated schedules
of negative reinforcement superimposed on the basic
escape procedure. They allowed the opportunity for 
responding whenever the shock was present. Features 
a) and b) coincide as before, but c), the period during 
which responses could affect shocks, was restricted 
(Dinsmoor, 1967). As in appetitive schedules, reinforcements
were set up during only parts of the
periods when responding could occur. The resulting 
behavior on basic ratio and on interval schedules 
resembled that of comparable schedules of positive 
reinforcement. 

So far I have said little about intervals when 
neither a) nor c) is operative. Of course, these inter- 
trial intervals are of only peripheral interest if they 
include no opportunity to respond and other behavior 
is not observed. However, if a completely free operant 
is allowed and responding can and does occur between 
trials that it cannot affect, we have a situation that 
can be compared with other free-operant procedures. 
The shock can function as a discriminative stimulus, 
delineating the availability of reinforcement as well as 
providing the basis for it. This resembles an appetitive 
discrimination procedure where in the presence of a 
discriminative stimulus (S^D), responses are reinforced
according to some schedule, with extinction in the
absence of the discriminative stimulus (S^Δ). Keehn
(1966) has noted patterns of intertrial responding
that suggest such an interpretation, with discrete-trial
negative reinforcement procedures seen as S^D-S^Δ
discrimination procedures.


The Escape/Avoidance Distinction, and a Preview of the Present Approach


Sometimes in appetitive discrimination procedures, 
a delay contingency is imposed during S^Δ so that responding
during this time prevents the onset of S^D.
Such a delay contingency can be added to the escape
procedure as well, and it has been done most simply
in one version of the well-known shock-delay pro- 
cedure devised by Sidman (1953)—the version with a 
shock-shock interval of zero. As diagrammed in Part II 
of Figure 4, in the absence of responding continuous 
shock is delivered. A response can terminate this 
shock for a period known as the response-shock (RS) 
interval; such a response is negatively reinforced by 
removal of shock. In addition, the opportunity to re- 
spond is continually present, and responses in the 
absence of shock delay the onset of shock. The shock 
will resume only if the response-shock interval elapses 
without an intervening response to restart the timing 
of this interval. In the appetitive case such a delay 
contingency reduces the future probability of the 
response to which it is applied. In the aversive case it 
has the opposite effect; responding in the absence of 
shock is maintained by delaying the onset of shock. In 
the present development of procedures the added de- 
lay feature introduces a new level of complexity. In 
previous procedures, whenever a response occurred it 
either removed the aversive stimulus or had no effect
on aversive stimulation. Now, different consequences 
are produced by different classes of responses. 

Within the tradition of aversive conditioning the 
presence vs. absence of shock has been the major dis- 
tinction between escape and avoidance. Escape re- 
sponses remove shock; avoidance responses prevent its 
occurrence. The validity of this distinction has been 
supported by experiments in which two different 
responses were independently maintained, one by re- 
moval of shock, and the other by delay or prevention 
of shock (e.g., Boren, 1961; Mowrer & Lamoreaux, 
1946). The present account replaces this aspect of the 
escape/avoidance distinction with a more general concept
of multiple contingencies of reinforcement under
stimulus control. Instead of the S^D-S^Δ discriminations
of the escape procedure (shock vs. no shock)
where responses are reinforced in the presence of one
stimulus and extinguished in the presence of another,
there are two S^Ds. A response in the presence of shock
both removes the shock and produces a shock-free
interval. Responses in the absence of shock delay it,
extending the shock-free interval. As with the escape/avoidance
distinction, if these two consequences are
made contingent upon separate responses, the sep- 
arate responses will be independently maintained. 
The separate effects, however, are not seen as resulting 
from distinct processes. Rather, they result from the 
fact that in discriminably different stimulus situa- 
tions, different responses are reinforced. In this re- 
spect, it is like a positive reinforcement procedure 
where the presence of one light denotes a given schedule
of reinforcement, and the presence of another
light denotes a different schedule of reinforcement.
Thus, to the extent that the different negative reinforcement
contingencies operate independently, denoted
by distinctive cues, I will treat them like those
of any other multiple-contingency procedure. 

Later I will describe procedures that use intermit- 
tent brief shocks, producing situations that are not 
easily discriminable, but that involve differing rein- 
forcement contingencies. In dealing with these I will 
distinguish between A) reinforcement by reduction of 
the density of aversive stimulation within a situation, 
and B) reinforcement by discriminable change of 
situation. Both categories include procedures that 
have been called “avoidance.” The placing of a given 
procedure in category A) or category B) will depend 
upon added cues as well as upon characteristics of the 
aversive stimulation itself. The reasons for making 
this distinction will become evident later. For the 
immediate present, I shall describe procedures that 
cut across both escape and avoidance. These procedures
require no special cues other than the shocks
themselves. They involve shock-delay or shock-dele- 
tion, based on fixed or on variable time intervals. 


The Continuum of Shock Density or Frequency 


Electric shock continuously delivered may not be 
continuously received. For example, if it is grid shock 
the animal may produce intermittency by jumping up 
and down. Nevertheless the escape procedure is 
treated as a clear case of negative reinforcement by 
removal of shock. The experimenter may even arrange
an escape procedure by explicitly presenting intermittent
pulses of shock several times per second, rather
than presenting it continuously. But if shock is presented
several times per second, why not just twice per
second, or once per second, or even less frequently? At 
some point we tend to stop labeling it continuous 
shock, and call it a stream of shocks. Responses are 
reinforced by interruption of (escape from) a stream 
of shocks. But as the pulses of shock are spaced out 
still further, to one every five, ten, or twenty seconds, 
we tend to characterize suspension of this situation 
not as removal of shock, or as interruption of a 
stream of shocks, but as reduction in shock frequency 
or density. One aspect of the shock delivery procedure 
that affects this characterization is the variability of 
the time between pulses of shock. Regularly spaced 
shocks seem to be appropriately characterized as 
streams or sequences until they become quite far 
apart. Irregularly spaced shocks are difficult to specify 
without reference to a distribution, whose measure of
central tendency translates into frequency or density.
Frequency and density are nearly equivalent. One 
stresses the number of shocks per unit time; the other 
stresses the time between events, but can also refer to 
the intensity and duration of those shocks. 


SHOCK-DELAY PROCEDURES 


The various forms of Sidman’s well-known free-operant
shock-delay procedure (Sidman, 1953) illustrate
the density continuum. Diagrammed in Part III
of Figure 4, the procedure is a more general form of 
the “‘escape-avoidance” procedure described above. In 
the absence of responding, instead of continuous 
shock, brief shocks are delivered periodically. These 
shocks are typically from 0.1 to 0.5 sec in duration,
and are separated by the shock-shock (SS) interval.
The shock-shock interval ranges from zero to more
typically 3, 5, 10, or 20 sec. A single response interrupts
this sequence for a time known as the response-shock
(RS) interval. With no additional intervening
responses, occurrence of the next shock reinstates the 
SS interval as the determinant of shock delivery. How- 
ever, if additional responses occur before the RS inter- 
val has elapsed, each one “resets the clock,” restarting 
the timing of the RS interval. In the terms of the 
preceding pages, a short SS interval denoting virtually
continuous shock might be considered as defining the
aversive situation, while the RS interval defines its
absence. A response during the brief SS interval produces
a relatively long RS interval, effectively removing
shock as in a standard escape procedure. However,
in procedures where the SS interval approaches the RS 
interval, the term “escape” no longer seems appropri- 
ate; a response during a long SS interval can be said 
to delay shock just as responses during the RS inter- 
val do. 
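
The timing rules of the shock-delay procedure can be sketched as a small simulation. The following is a minimal illustration rather than a description of Sidman's apparatus: the function name `shock_delay_schedule`, its parameters, the treatment of shocks as instantaneous events, the choice to start the session with the SS timer running, and the rule that a response coinciding exactly with a scheduled shock arrives too late to postpone it are all assumptions introduced here. The case of an SS interval of zero (continuous shock) is not modeled.

```python
def shock_delay_schedule(response_times, ss, rs, session_end):
    """Return the times of delivered shocks on a shock-delay schedule.

    ss: shock-shock interval (sec), in force after each shock.
    rs: response-shock interval (sec), restarted by every response.
    Assumptions (illustrative only): shocks are instantaneous, and the
    session opens with the SS timer running.
    """
    if ss <= 0 or rs <= 0:
        raise ValueError("ss and rs must be positive")
    shocks = []
    next_shock = ss                       # first shock is due one SS interval in
    for r in sorted(response_times):
        if r >= session_end:
            break
        # Deliver every shock that falls due up to this response.
        while next_shock <= r:
            shocks.append(next_shock)
            next_shock += ss              # a shock reinstates the SS interval
        next_shock = r + rs               # a response "resets the clock"
    # After the last response, shocks continue until the session ends.
    while next_shock < session_end:
        shocks.append(next_shock)
        next_shock += ss
    return shocks


# SS = 5 sec, RS = 20 sec: a long pause after the response at 20 sec lets the
# RS interval elapse (shock at 40), and the SS interval then takes over (45).
print(shock_delay_schedule([7, 20, 46], ss=5, rs=20, session_end=60))  # [5, 40, 45]
```

Under these assumptions, an animal that responds more often than once per RS interval receives no shocks, while one that pauses longer than the RS interval receives a shock and then further shocks every SS seconds until its next response.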

While the SS and RS intervals do not cleanly distinguish
escape from avoidance, or even define the
presence versus absence of an aversive situation, they do
have distinguishable effects on behavior. The effect of 
each is partly determined by the value assigned to the 
other. For example, Sidman (1962a) has noted that 
acquisition of lever-press responding is more easily 
achieved if the SS interval is substantially shorter than 
the RS interval. Leaf (1965) documented this effect 
with a between-group comparison in which each an- 
imal was run for only one session. Using a 20-second 
RS interval combined with SS intervals of 1, 3, 5, 10, 
or 20 seconds, he found consistent acquisition with 
SS intervals of 5 sec or less. Acquisition was less con- 
sistently obtained with the SS interval of 10 sec; and 
an SS interval equaling the 20-sec RS interval was
only marginally effective. In contrast, Clark and Hull
(1966) reported reliable acquisition with equal RS and 
SS intervals of 60 sec or more. However, in unpub- 
lished work I have been unable to replicate this lat- 
ter result. 

Once responding is established, RS and SS intervals 
affect response rates somewhat differently, as shown 
by one of Sidman’s early experiments (Sidman, 1953). 
Sidman measured overall response rates while he 
varied the RS interval, holding the SS interval con- 
stant for each series of RS values, and changing the 
SS interval between series. In this way he obtained a 
family of functions for each subject; data from one of 
the rats is presented in Figure 5. With SS intervals 
of 5 sec or greater, maximal response rates occurred 
when the RS interval was equal to or slightly shorter 
than the SS interval. Response rates dropped off gradu- 
ally as the RS interval was increased beyond the SS 
interval, giving plots that were concave upward. Re- 
sponse rates dropped off more quickly as the RS inter- 
val was decreased, giving plots that were concave 
downward, tending sharply toward zero with small RS 
values. With smaller SS intervals, where there was little
room to manipulate the RS interval at values below
the SS interval, only the concave-upward portion
of the curve was obtained.

Fig. 5. One subject’s rates of lever-press responding as response-shock
intervals were varied in relation to constant shock-shock
intervals. Each column of the table indicates a series obtained,
in irregular order, with a given SS interval. Some of the series
shown in the table are plotted in the curves. (After Sidman,
1953.)

More recently, Clark and Hull (1966) examined the 
maintenance of overall response rates by procedures 
with equal RS and SS intervals, varying these inter- 
vals together. While they used preselected rats that 
had produced especially low shock rates during prior 
experimentation, the general features of their results 
agree with other less complete, but comparable data. 
As shown by the open symbols on the left side of 
Figure 6, increases in the RS = SS interval between 
10 sec and 60 sec produced decreases in response rate 
according to a roughly hyperbolic function, transformed
to roughly linear plots by using the reciprocals
of the intervals. Clark and Hull noted that transformations
other than the reciprocal, such as semilog
and log-log transformations, also produced approximately
linear plots of response rate as a function of
RS = SS interval. They found no compelling basis for
selecting one of these transformations over another. 

When Sidman (1953) held the SS interval constant 
at 2.5 seconds, varying the RS interval through larger
values, he obtained response rate changes comparable
to those obtained by Clark and Hull. This is demon- 
strated by the plot with asterisks on the left side of 
Figure 6. These points were obtained by replotting a 
representative set of Sidman’s data from Figure 5, 
in terms of reciprocals of the RS interval. Thus, as
Sidman concluded from the experiment represented 
by Figure 5, the SS interval apparently has little effect 
in determining the shape of the rate versus RS func- 
tion, provided that the RS interval is not substantially 
smaller than the SS interval. This is not a surprising 
outcome for well-conditioned animals that eliminate 
most shocks; the few shocks that they receive would 
be initiated by the Response-Shock timer. The results 
obtained by Clark and Hull with equal RS and SS in- 
tervals show relationships that would be expected on 
any schedules in which the RS interval exceeds the 
SS interval, at least for animals that eliminate most 
shocks. 

The right side of Figure 6 shows shock rates that 
correspond to the response rates just considered. The 
open points indicate roughly linear increases in 
received shock rates as Clark and Hull increased the 
maximum possible shock frequency. However, for 
most animals, received shock rate was not simply a 
percentage of maximum shock rate. Clark and Hull 
reported that as the maximum shock rate was re- 
duced to low values—corresponding to long SS = RS 
intervals—the percent shock reduction increased 
and more responses were emitted per shock received.
This is supported by plots on the right side of Figure 6
that tend toward zero to the right of the origin.
The departure from a simple percentage relationship 
was not large. It appears reliable, however, for Sid- 
man (1953) also reported a comparable effect when 
increasing the RS interval and using the reciprocal of 
the RS interval in place of maximum shock frequency. 
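
The quantities discussed in this passage are simple ratios, and a short sketch may make their relations concrete. It assumes equal RS and SS intervals, so that the maximum possible shock rate is one shock per interval. The function name `shock_measures`, its parameters, and the session figures in the usage line are hypothetical; they are not data from Clark and Hull or from Sidman.

```python
def shock_measures(interval_sec, responses, shocks_received, session_sec):
    """Summarize one session on a schedule with RS = SS = interval_sec.

    Rates are expressed per minute. All inputs are illustrative placeholders.
    """
    minutes = session_sec / 60.0
    max_shock_rate = 60.0 / interval_sec           # shocks/min with no responding
    received_rate = shocks_received / minutes      # shocks/min actually received
    response_rate = responses / minutes            # responses/min
    percent_reduction = (
        100.0 * (max_shock_rate - received_rate) / max_shock_rate
    )
    # Responses per shock is undefined when no shocks were received.
    per_shock = responses / shocks_received if shocks_received else None
    return {
        "max_shock_rate": max_shock_rate,
        "received_shock_rate": received_rate,
        "response_rate": response_rate,
        "percent_reduction": percent_reduction,
        "responses_per_shock": per_shock,
    }


# Hypothetical session: 20-sec intervals, 60 min, 360 responses, 18 shocks.
print(shock_measures(20, 360, 18, 3600))
```

If received shock rate were a simple percentage of maximum shock rate, `percent_reduction` would stay constant across schedules; the departure described above corresponds to `percent_reduction` and `responses_per_shock` rising as `max_shock_rate` falls.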


SHOCK-DELETION PROCEDURES 


A different type of negative reinforcement pro- 
cedure provides continuous opportunity to respond 
and also incorporates much of the continuum, from 
pulse streams to widely spaced shocks. Instead of pro- 
viding for a response to reset the timing interval and 
thus delay shock, these procedures allow a response to 
cancel or delete an impending shock without affecting 
the time cycles for shock delivery. The basic characteristics
are readily evident in a fixed-cycle procedure
described by Sidman (1966). As the label implies, the
procedure is based on a timing cycle that progresses 
independently and constantly, regardless of the sub- 
ject’s behavior. With no responding, each timing cycle 
ends with a brief inescapable shock and starts again. 

Fig. 6. Response rates and obtained shock rates as functions of
the maximum shock rates possible under shock-delay and shock-deletion
procedures. Each open symbol represents a pair of
equal shock-shock and response-shock intervals on Sidman’s
shock-delay procedure, as used by Clark and Hull (1966). Differently
shaped data points represent performances of different
rats. The plot with asterisks shows response rate as a function
of the reciprocal of the RS interval (which, for this set of
intervals, does not equal maximum shock rate), with the SS
interval held constant at 2.5 sec. These data, obtained by
Sidman (1953), are taken from Figure 5. The plots with filled
symbols show response rates and received shock rates as functions
of maximum shock rates on a shock-deletion procedure
based on variably spaced shocks (after de Villiers, 1974). Most
points show the mean of two determinations for a given VI
schedule: one from an ascending, and one from a descending
series of VI values.

The first response in a cycle cancels the shock due at
the end of that cycle; additional responses during the
cycle have no effect. Thus, as indicated in Part I of
Figure 7, the opportunity to respond is continuous,
but the occasion for a response to affect shock is limited
to one effective response per cycle.

[Figure 7 panel titles: II. Variable-Cycle Deletion; III. Fixed-Cycle Deletion with t^Δ Interposed]

Fig. 7. Diagrams showing the relations between features of
shock-deletion procedures. A given feature is in effect when its
corresponding line is displaced upward. A straight line indicates
that a given feature is continuously in effect. An “R” indicates
the occurrence of a response. Part I illustrates a procedure
based on fixed-cycle shock delivery (after Sidman, 1966). Part
II illustrates a procedure that is formally identical except it is
based on variable-cycle shock delivery (after de Villiers’s 1974
“variable-interval” shock deletion). Part III illustrates a fixed-cycle
procedure with imposed t^Δ periods when responding
cannot affect the impending shock. On this procedure, t^D indicates
periods when responding is effective. In this example t^Δ
and t^D are of equal duration. Their relative durations can be
varied (after Hurwitz & Millenson, 1961). In all three parts
of the figure, sequences of events with and without responding
are portrayed. The dashed lines identify points at which responding
has resulted in deletion of shock.

This procedure has not been studied as extensively or as systematically
as has the shock-delay procedure, but it is
clearly effective for producing and maintaining lever-press
responding. The fixed-cycle shock-deletion procedure
is similar to the shock-delay procedures considered
above in that both use fixed shock-shock
intervals, and both allow frequent responding to elimi- 
nate all shocks. The two procedures differ in that all 
responses on the delay procedure affect shock, while 
on the deletion procedure not all responses are effec- 
tive. They also differ with respect to which relations 
between responding and shock are fixed, and which 
are variable. In the delay procedures the interval be- 
tween response and shock is fixed. The amount of 
shock reduction resulting from each response varies 
with the subject’s spacing of responses. In deletion 
procedures the subject’s spacing of responses affects 
the interval between response and shock, but the 
amount of shock reduction is tightly controlled; an 
effective response eliminates exactly one shock. Closely 
spaced responses are relatively ineffective in both. On the
delay procedure, a response that closely follows another
can produce only a small increment in shock delay. On the
deletion procedure, it has a low probability of deleting
a shock, since it is likely to fall in the same cycle as
the earlier response.
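
The contrast with the delay procedure may be easier to see in code. The sketch below is illustrative only: the function name `fixed_cycle_deletion` and its parameters are hypothetical, shocks are treated as instantaneous events at the end of each cycle, and a response falling exactly on a cycle boundary is assumed to belong to the cycle that is just starting.

```python
def fixed_cycle_deletion(response_times, cycle, session_end):
    """Return delivered shock times on a fixed-cycle shock-deletion schedule.

    Cycles run back to back from time 0 regardless of behavior. Each cycle
    ends with a shock unless at least one response occurred during it.
    """
    if cycle <= 0:
        raise ValueError("cycle must be positive")
    n_cycles = int(session_end // cycle)            # complete cycles in the session
    cycles_with_response = set()
    for r in response_times:
        if 0 <= r < n_cycles * cycle:
            # Only the cycle index matters: extra responses in the same
            # cycle add nothing to the set.
            cycles_with_response.add(int(r // cycle))
    return [
        (i + 1) * cycle
        for i in range(n_cycles)
        if i not in cycles_with_response
    ]


# Three responses bunched in the first cycle delete only one shock...
print(fixed_cycle_deletion([3, 4, 5], cycle=10, session_end=60))    # [20, 30, 40, 50, 60]
# ...whereas three responses spread over three cycles delete three.
print(fixed_cycle_deletion([3, 13, 23], cycle=10, session_end=60))  # [40, 50, 60]
```

Each effective response removes exactly one shock, and the timing of the remaining shocks is untouched. On the delay procedure, by contrast, every response shifts the time of the next shock.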

Shock-delay procedures need not use constant time 
intervals. Indeed, Sidman and Boren (1957a) modified 
the basic delay procedure by changing the response- 
shock interval after each shock. This procedure read- 
ily produced both acquisition and maintenance of 
lever-pressing. Bolles and Popp (1964) used a delay 
procedure in which the response-shock interval was 
constant but the shock-shock interval varied from one 
shock to the next, with a mean value slightly under 
seven seconds. They compared acquisition on this 
procedure to that on a standard delay procedure 
where the RS interval was 15 sec and the SS interval 
was constant at five sec. This was a between-subject 
comparison, and their small number of rats prohib- 
ited concluding that the variable shock-shock interval 
was superior to the fixed one. However, the observed 
differences were clearly in that direction. In unpub- 
lished work I have found consistently good acquisi- 
tion and maintenance of lever pressing on a procedure 
where both the response-shock and shock-shock intervals
varied randomly after each response and each
shock. Clearly, fixed intervals are not critical to the 
effectiveness of the shock-delay procedures. 

The shock-deletion contingency, where timing cy- 
cles are independent of behavior, is also readily 
adapted to variable shock-shock intervals, as shown by 
de Villiers (1974). His procedure was identical to Sidman’s
fixed-cycle shock-deletion procedure described
above, except that the shock-shock interval varied 
from cycle to cycle. In the absence of responding the 
rats received brief shocks, irregularly spaced so that 
the probability of a shock was roughly constant over 
time. The first response in an SS interval deleted the 
shock due at the end of that interval; additional re- 
sponses within the interval had no effect. When the 
time for a (deleted) shock was passed, the next re- 
sponse would again be effective. A plausible sequence 
of events on this procedure, showing shocks delivered 
by identical random scheduling sequences with and 
without responding, is shown in Part II of Figure 7.
De Villiers first established responding with a sched- 
ule whose mean shock-shock interval in the absence of 
responding was 15 sec. He then varied this mean interval
between 15 sec and 60 sec, over blocks of sessions.
The resulting response rates are plotted individually
for the four rats, with the filled symbols in
Figure 6. These plots represent combined data from 
increasing and decreasing series, which de Villiers 
presented separately. With maximum shock frequency 
as an independent variable common to the two types 
of procedures, de Villiers’s results are readily compared 
with the results obtained with fixed shock-delay pro- 
cedures by Clark and Hull (1966). Interestingly, al- 
though absolute response rates are much higher for 
two of de Villiers’s four animals, response rates versus 
maximum shock rate gave roughly similar functions 
in both cases. The “received shock frequency” plots 
for the two procedures show even greater similarity 
than the response rate plots. However, systematic re- 
lationships based on responses emitted per shock re- 
ceived and percent shock reduction as a function of 
maximum shock frequency observed on shock-delay 
procedures both by Sidman and by Clark and Hull 
were not consistently obtained on this procedure. 
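
A variable-cycle deletion schedule of this kind can be sketched as follows. This is an illustration under stated assumptions, not de Villiers's actual programming: the names `random_shock_times` and `variable_cycle_deletion` are hypothetical, exponentially distributed intervals are used here as one way of keeping the probability of shock roughly constant over time, and a response coinciding exactly with a scheduled shock is counted toward the following interval.

```python
import bisect
import random


def random_shock_times(mean_interval, session_end, seed=0):
    """Generate scheduled shock times with exponentially distributed gaps.

    The exponential distribution is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(1.0 / mean_interval)
        if t >= session_end:
            return times
        times.append(t)


def variable_cycle_deletion(scheduled, response_times):
    """Return the shocks actually delivered.

    scheduled: sorted list of scheduled shock times, fixed in advance.
    The first response in each inter-shock interval deletes the shock
    that ends the interval; further responses in it have no effect.
    """
    deleted = set()
    for r in response_times:
        i = bisect.bisect_right(scheduled, r)   # index of the next shock due after r
        if i < len(scheduled):
            deleted.add(i)                      # repeated responses hit the same index
    return [t for i, t in enumerate(scheduled) if i not in deleted]


# Two responses before the first shock delete only that shock; the response
# at 31 deletes the shock due at 33.
print(variable_cycle_deletion([4, 9, 30, 33], [1, 2, 31]))   # [9, 30]
```

The schedule is generated once and is unaffected by responding, which is what allows identical scheduling sequences to be compared with and without responses, as in Part II of Figure 7.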


SHOCK-FREQUENCY REDUCTION AS A 
CONTROLLING VARIABLE 


While response-shock intervals, shock-shock inter- 
vals, and maximum shock rate are straightforward as 
independent variables, none of these by itself ade- 
quately describes the consequence of responding when 
procedures are based on brief intermittent shocks. De 
Villiers addressed the problem of specifying the con- 
sequence of responding with a single expression, and 
at the same time provided evidence for functional 
similarity between positive and negative reinforce- 
ment. He also provided an additional means for com- 
paring the effects of shock-delay and shock-deletion 
procedures.

[Figure 8 axis labels: Response Rate; Shock Frequency Reduction]

Fig. 8. Reciprocals of response rates plotted against reciprocals
of shock-frequency reductions on shock-delay and shock-deletion
schedules. Part A shows data obtained by de Villiers (1974)
with a variable-cycle shock-deletion procedure. Part B shows
data obtained by Clark and Hull (1966) with shock-delay
schedules in which SS and RS intervals were held equal as they
were varied over blocks of sessions. These plots are based on
the same data that were presented in Figure 6.

De Villiers started with a relationship that
Herrnstein had derived in studies of concurrent sched- 
ules of positive reinforcement and then developed to 
relate the rate of responding to the rate of reinforcement
on single VI schedules (Herrnstein, 1970). A
detailed treatment of this formulation can be found
in Chapter 9 of the present volume. For the present
purpose, suffice it to say that Herrnstein’s equation
predicts a linear relationship between the reciprocal
of response rate and the reciprocal of the rate of reinforcement.†
Since assessment of linearity is simpler
than other forms of curve fitting, the response rates 
are plotted as reciprocals in Figure 8, using the data 
from Figure 6. For both de Villiers’s data and those of 
Clark and Hull, shock-frequency reduction was com- 
puted for each animal on each schedule by subtracting 
the obtained shock rate from the shock rate that 
would occur with no responding. Part A of Figure 8 
shows that this formulation describes the performance 
of each of de Villiers’s four animals well; each plot is 
linear although the slopes differ for different animals. 

† Herrnstein’s equation is $R_1 = k r_1 / (r_1 + r_o)$, where $R_1$ is
the rate of responding; $r_1$ is the frequency of reinforcement
contingent on that response; $k$ is the response rate if no alternative
responses were reinforced; and $r_o$ is the sum of reinforcement
frequencies contingent on responses other than that
described by $R_1$ (whether or not these are measured by the experimenter).
If this equation is inverted it predicts a linear
relation between performance and reinforcement:
$1/R_1 = 1/k + (r_o/k)(1/r_1)$.

De Villiers tried substituting received shock in place
of shock-frequency reduction, but found this produced
greater departures from linearity. Interestingly, re- 
sponses of two of Clark and Hull’s three animals show 
a comparable degree of linearity when plotted in the 
same way, as shown in Part B of Figure 8. De Villiers 
has argued that a linear relationship would not be 
expected on fixed shock-delay procedures since short- 
term effects of spaced responding would override the 
overall effects of reinforcement frequency. Perhaps the 
equating of relative reinforcement rate to shock fre- 
quency reduction can be done more generally than 
de Villiers proposed. 
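
The linearity check described above can be sketched numerically. Herrnstein's hyperbola, R = k·r/(r + r_o), becomes 1/R = 1/k + (r_o/k)(1/r) when inverted, so a straight-line fit to the reciprocals yields estimates of k and r_o, with shock-frequency reduction standing in for r. The sketch below is an illustration only: the function names are hypothetical, the fit is an ordinary least-squares line (the text does not say how linearity was assessed), and the input values are generated from the equation itself rather than taken from de Villiers or from Clark and Hull.

```python
def shock_frequency_reduction(max_shock_rate, obtained_shock_rate):
    """Shock rate with no responding minus the shock rate actually obtained."""
    return max_shock_rate - obtained_shock_rate


def fit_reciprocals(reductions, response_rates):
    """Least-squares line through (1/reduction, 1/response_rate).

    Returns (k, r_o) recovered from the intercept 1/k and slope r_o/k.
    """
    xs = [1.0 / r for r in reductions]
    ys = [1.0 / rate for rate in response_rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k = 1.0 / intercept
    r_o = slope * k
    return k, r_o


# Synthetic check: build response rates from the hyperbola with known
# parameters, then confirm that the reciprocal fit recovers them.
true_k, true_ro = 20.0, 2.0
reductions = [shock_frequency_reduction(m, 0.1 * m) for m in (1.0, 2.0, 3.0, 4.0)]
rates = [true_k * r / (r + true_ro) for r in reductions]
k, r_o = fit_reciprocals(reductions, rates)
print(round(k, 6), round(r_o, 6))
```

With real data the points would scatter about the line, and systematic curvature in the reciprocal plot would count against the formulation.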

Both de Villiers and Clark and Hull varied shock-frequency
reduction only by manipulating the maximum
shock frequency which could occur in the absence
of responding. Their experiments implicate
shock-frequency reduction mainly on the basis of its 
fitting a particular function when plotted in relation 
to response rates on these procedures. Other studies 
have more directly indicated shock-frequency reduc- 
tion as a controlling variable, by manipulating this 
variable independently of the maximum shock frequency.
The first instance of this was accomplished
inadvertently by Sidman (19624). Rats were con- 
currently exposed to two independent shock-delay 
schedules, using separate response levers. Each schedule
was controlled by a single timer that was reset by
responses on its appropriate lever, recycling as it de- 
livered shock when no such responses had occurred. 
Hence, considered separately, each schedule was a con- 
ventional shock-delay procedure with equal RS and $5 
intervals. However, the two timers delivered indis- 
tinguishable shocks, so they combined to produce 
irregular shock sequences. Eyen when the two timers 
were given equal settings the rats tended to respond 
mostly on one lever, receiving frequent shocks from 
the schedule that required responses on the other 
lever. The rats’ choice not to distribute responses on 
the two levers effected a lower limit to shock-fre- 
quency reduction. When two timers were given un- 
equal settings, responding to the one with the greater 
interval would produce longer delays of individual 
shocks, but would result in relatively higher overall 
shock frequencies due to the continual recycling of 
the other timer. Responding to the lever with the 
shorter timer would produce shorter delays of
individual shocks, but if this responding was
sufficiently frequent it would result in relatively lower
overall shock frequencies. With these unequal set- 
tings, the rats tended to respond exclusively on the 
lever that gave a shorter shock-delay per response, but 
which permitted a greater decrease in overall shock 
frequency. This experiment then led Sidman to propose
shock-frequency reduction as a critical variable.
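
The arithmetic behind the unequal-settings case can be made concrete with a small sketch. It is only an illustration: the function name and the 10-sec and 30-sec timer settings are hypothetical and are not taken from Sidman's report, and the sketch assumes that all responses go to one lever and arrive often enough that the timer on that lever never runs out.

```python
def overall_shocks_per_min(ignored_timer_s, chosen_timer_s, response_gap_s):
    """Shock rate when every response goes to one lever of the pair.

    ignored_timer_s: interval (sec) of the timer whose lever is never pressed
    chosen_timer_s:  interval (sec) of the timer whose lever is pressed
    response_gap_s:  steady spacing (sec) between presses on the chosen lever
    """
    if response_gap_s >= chosen_timer_s:
        raise ValueError("sketch assumes presses arrive before the chosen timer runs out")
    # The chosen timer is always reset in time, so it never delivers a shock.
    # The ignored timer is never reset: it recycles and shocks once per interval.
    return 60.0 / ignored_timer_s


# Hypothetical settings: a 10-sec timer on one lever, a 30-sec timer on the other.
# Pressing the long-timer lever gives long delays per response but leaves the
# 10-sec timer recycling; pressing the short-timer lever leaves only the 30-sec timer.
print(overall_shocks_per_min(ignored_timer_s=10, chosen_timer_s=30, response_gap_s=20))
print(overall_shocks_per_min(ignored_timer_s=30, chosen_timer_s=10, response_gap_s=5))
```

Under these assumed settings the lever with the shorter delay per response yields the lower overall shock frequency, which is the dissociation described above.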

Following this lead, Herrnstein and Hineline (1966) 
devised a procedure that permitted direct and inde- 
pendent manipulation of both maximum shock fre- 
quency and the amount of shock-frequency reduction 
that responding could produce. This procedure, dia- 
grammed in Figure 9, was based on two independent 
random schedules for delivering brief shocks. The
schedules were generated by independently sampling 
two probability distributions once every two seconds. 
Averaging over longer time periods, the frequency of 
scheduled shocks was roughly constant over time, but 
the two schedules could have differing frequencies. In 
the absence of responding, the schedule with higher 
probability controlled the delivery of shock. A re- 
sponse transferred control to the schedule with lower 
probability, where the control remained until that 
schedule delivered a shock, at which time the control 
transferred back to the schedule with higher probabil- 
ity. Thus, one probability or the other was operative 
depending on whether a response or a shock had 
occurred last. On this procedure, a response could 
reduce shock frequency by a specified amount, but it
could not impose a shock-free period, for shocks could
and did occur immediately after some responses. This
procedure permitted direct experimental manipula- 
tion of response-contingent shock-frequency reduction. 
Acquisition was reliably obtained even when the two 
shock schedules had probabilities as close as 0.2 vs. 0.3, 
giving mean shock frequencies of six and nine shocks 
per minute. However, the Herrnstein and Hineline
experiment was mainly a study of acquisition, and
included only a few combinations of shock frequencies,
permitting no detailed analysis of response rate as a
function of shock-frequency reduction. More systematic
data would be of interest, especially in relation
to the work of de Villiers described just above.

Fig. 9. Diagram showing the relations between features of a
procedure for response-contingent shock-frequency reduction.
The punched tape advances at regular 2-sec intervals. Deflections
on the lines marked “post-shock” and “post-response” indicate
holes in these respective channels of the tape; probability of a
hole is constant for a given channel, but the probability can
differ between channels. Shocks are shown as deflections on the
line indicated. “R” indicates a response. The delivery of shock
coincides with the occurrence of a hole in the tape channel
currently in control, which correlates with whether a response
can affect shock. Control is changed from one channel to the
other by a shock if a response has occurred since the last shock,
and by a response if a shock has occurred since the last response.
(From Herrnstein & Hineline, 1966. © 1966 by the Society for
the Experimental Analysis of Behavior, Inc.)
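
The logic of the two-schedule procedure can be sketched as a small simulation. This is a simplified illustration rather than a reconstruction of the original apparatus: the function names and the `responds` rule are hypothetical, and the sketch assumes that any response falls before the sample taken in the same 2-sec step.

```python
import random


def expected_shocks_per_minute(p, step_s=2.0):
    """Mean shock rate if one schedule with per-sample probability p stays in control."""
    return p * 60.0 / step_s


def simulate(p_post_shock, p_post_response, responds, n_steps, seed=0):
    """Count shocks over n_steps samples of the two-schedule procedure.

    responds(step, control) -> bool is a hypothetical response rule: it says
    whether a response occurs before the sample taken at this step.
    """
    rng = random.Random(seed)
    control = "post_shock"  # the higher-probability schedule controls at the start
    shocks = 0
    for step in range(n_steps):
        if control == "post_shock" and responds(step, control):
            control = "post_response"  # a response hands control to the lower schedule
        # Both schedules are sampled on every step, independently of behavior.
        hole_post_shock = rng.random() < p_post_shock
        hole_post_response = rng.random() < p_post_response
        hole = hole_post_shock if control == "post_shock" else hole_post_response
        if hole:
            shocks += 1
            control = "post_shock"  # a shock returns control to the higher schedule
    return shocks


# Probabilities of 0.3 and 0.2 per 2-sec sample, as in the closest pair cited above.
print(expected_shocks_per_minute(0.3), expected_shocks_per_minute(0.2))
```

Passing a rule that never responds leaves the higher-probability schedule in control throughout, while a rule that responds after every shock keeps the lower-probability schedule in control; in neither case does a response guarantee a shock-free period.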

Returning to schedule classification, Herrnstein 
and Hineline’s procedure resembles the shock-deletion 
procedure described earlier (Part I of Figure 6), in 
that the time-base of the shock-delivery schedule pro- 
ceeds independently of behavior. No clocks are reset 
as would occur in shock-delay procedures. Unlike the 
earlier shock-deletion procedure, an effective response 
deletes shock only probabilistically, as determined by 
the difference between the two random shock-delivery 
schedules. This procedure also differs from the earlier 
shock-deletion procedures in that a response reduces 
the probability of shock, not only until the end of the 
timing cycle when a shock is deleted, but until a shock 
is delivered by the low frequency schedule. The 
amount of exposure to the low frequency shock sched- 
ule is determined by that schedule as well as by the 
amount of responding. 

If control were to revert to the high-frequency 
schedule at the end of a timing cycle irrespective of 
whether a shock had been deleted, the procedure 
would fit a descriptive system that has recently been 
gaining some currency. This formulation has been 
described by Church (1969), Catania (1971), Seligman, 
Maier, and Solomon (1971), and by Gibbon, Berryman,
and Thompson (1974), as a means for defining
the degree of contingent relations between responding 
and rewarding stimuli or noxious stimuli. Neffinger 
and Gibbon (1975) have used the formulation to gen- 
erate negative reinforcement procedures with added 
cues. It could also be used to increase the generality 
of uncued shock-deletion procedures that have been 
described above (Part I of Figure 6) and enable these, 
like Herrnstein and Hineline’s procedure, to manipu- 
late shock frequency reduction independently of abso- 
lute shock frequency. The formulation simply is this: 
in any given time period, probability of shock, given 
a response, can be manipulated independently of the 
probability of shock given no response in that time 
period. On the standard, fixed-cycle shock-deletion
procedure (Part I of Figure 6), the probability of 
shock given no response is 1.0; the probability of 
shock given a response is zero. This need not be the 
case; either probability can be manipulated between 
these values. So long as the probability of shock given 
a response is smaller than the probability given no 
response, negative reinforcement can occur. The pos- 
sible shock frequencies are determined jointly by 
probabilities per cycle, and cycle length. Cycle length 
sets the maximum shock frequency. Also, cycle length, 
independently of probability, determines the minimum
response rate needed to achieve a given degree
of shock-frequency reduction. On Herrnstein and 
Hineline’s procedure, the minimal shock frequency 
could be achieved by a minimum of one response 
after each shock. On the probabilistic fixed-cycle 
shock-deletion procedures just outlined, a minimum 
of one response per cycle is required to achieve the 
minimal shock frequency, no matter when specific 
shocks have occurred. 
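
The contingency formulation lends itself to a short worked example. The sketch below is illustrative only: the function names, the 30-sec cycle and the intermediate probabilities are assumptions chosen for the arithmetic, not values from any of the studies cited.

```python
def shock_probability_per_cycle(p_given_response, p_given_no_response,
                                frac_cycles_with_response):
    """Overall probability of shock in a cycle, mixing the two conditional values."""
    return (frac_cycles_with_response * p_given_response
            + (1.0 - frac_cycles_with_response) * p_given_no_response)


def shocks_per_minute(p_given_response, p_given_no_response,
                      frac_cycles_with_response, cycle_s):
    """Shock frequency implied by the per-cycle probabilities and the cycle length."""
    per_cycle = shock_probability_per_cycle(
        p_given_response, p_given_no_response, frac_cycles_with_response)
    return per_cycle * 60.0 / cycle_s


def contingency(p_given_response, p_given_no_response):
    """Positive when responding lowers shock probability (negative reinforcement possible)."""
    return p_given_no_response - p_given_response


# Standard fixed-cycle shock-deletion: P(shock | no response) = 1, P(shock | response) = 0.
print(shocks_per_minute(0.0, 1.0, frac_cycles_with_response=0.0, cycle_s=30.0))  # maximum
print(shocks_per_minute(0.0, 1.0, frac_cycles_with_response=1.0, cycle_s=30.0))  # minimum
# A hypothetical intermediate case, responding in half of the cycles.
print(shocks_per_minute(0.25, 0.75, frac_cycles_with_response=0.5, cycle_s=30.0))
```

The first call shows how cycle length alone sets the maximum shock frequency, and the second shows that one response per cycle is what achieves the minimum.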

I noted earlier that shock density is an alternative 
specification to shock frequency. Density differs from 
frequency, partly in emphasizing the spacing between 
shocks rather than shocks per unit time, but also because
it can encompass the duration and intensity of
shocks as well as the frequency of shock onset. For 
example, if shocks were to occur every five seconds, a 
reduction in either the duration or the intensity of 
shocks would count as a density reduction even 
though shock frequency remained constant. Powell 
and Peck (1969) demonstrated the potency of such a 
density reduction with a procedure that illustrates an 
interesting hybrid between shock-delay and shock-deletion.
Their schedule resembled shock-deletion in
that the time-based shock delivery schedule proceeded 
independently of behavior. However, the effects of re- 
sponding were metered through a response-shock in- 
terval identical to that of a shock-delay schedule, as 
follows: Brief shocks were delivered every five sec- 
onds. Responses started a 20-second timer with the 
same contingent relationship as for Sidman’s delay 
procedure. However, instead of delaying shocks, start- 
ing this timer reduced the intensity of the one-per-five- 
second shocks. So long as 20 seconds did not elapse 
without a response, all shocks were delivered at the 
lower intensity. Powell and Peck found acquisition 
with this procedure to be more rapidly and reliably 
achieved than with a standard shock-delay procedure 
with SS = 5 and RS = 20. 
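
The hybrid character of this schedule is easy to see in a sketch. The following is a minimal illustration under stated assumptions: the function name and the intensity values (1.0 and 0.5, in arbitrary units) are hypothetical, and each response is assumed simply to restart the 20-second timer.

```python
def shock_intensities(response_times, session_s, high, low,
                      shock_every_s=5.0, timer_s=20.0):
    """Return (time, intensity) for every scheduled shock in the session.

    Shocks arrive every shock_every_s seconds regardless of behavior. A shock is
    delivered at the low intensity only if a response occurred within the
    preceding timer_s seconds.
    """
    responses = sorted(response_times)
    delivered = []
    t = shock_every_s
    while t <= session_s:
        prior = [r for r in responses if r <= t]
        timer_running = bool(prior) and (t - prior[-1]) < timer_s
        delivered.append((t, low if timer_running else high))
        t += shock_every_s
    return delivered


# One response at 2 sec: the next four shocks are reduced, then intensity reverts.
for time_s, intensity in shock_intensities([2.0], 40.0, high=1.0, low=0.5):
    print(time_s, intensity)
```

Shock frequency is identical with or without responding; only the intensity, and hence the density, changes.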


Schedules of Negative Reinforcement 
Based On Intermittent Shocks 


The shock-delay procedures considered so far have 
provided for every occurrence of the response to be 
effective. As will be seen below, schedules of reinforce- 
ment are easily achieved by imposing a ratio require- 
ment or by limiting the access to reinforcement. On 
the shock-deletion procedures already described, there 
have been intervals when responses were ineffective. 
However, these were periods of reinforcement, when 
impending shock had already been cancelled or re- 
duced in probability. Disabling the response lever at 
these times is analogous to disabling the lever in the
presence of a food reinforcer in appetitive procedures. 
Schedules of reinforcement by shock-deletion require 
modifying response effectiveness at other times as well. 

Hurwitz and Millenson (1961) did this, examining
the maintenance of responding while they systematically
varied the access to reinforcement in the face of
uncancelled shocks. Their procedure, diagrammed in
Part III of Figure 7, was conceived within a formulation
developed by Schoenfeld and his associates
(Schoenfeld and Cole, 1972). The formulation uses
two time periods, designated tΔ and tD, which alternate
continually to produce fixed cycles. Positive
reinforcement on this schedule is customarily delivered
immediately upon occurrence of the first response in
tD. Additional responses during tD are ineffective, as
are those during tΔ. In the negative reinforcement
case arranged by Hurwitz and Millenson, the first
response during tD deleted a shock that was scheduled
to occur at the end of tD. With tΔ equal to zero, as was
the case during initial training in Hurwitz and
Millenson’s experiment, this procedure is identical to the
simple fixed-cycle shock-deletion procedure
diagrammed in Part I of Figure 7. As tΔ is increased to
values greater than zero, the procedure becomes one
in which the shock-deletion periods are alternated
with periods during which responding cannot affect
the impending shock. Hurwitz and Millenson held
the sum of tΔ and tD constant at 30 seconds, and
systematically increased tΔ, changing its value every
few sessions. The resulting response rates and shock
rates are shown in Figure 10, each giving a systematic
relation between response rate and the relative time
that reinforcement was accessible. When most of the
cycle was spent in tΔ, response rates were low, and
many shocks were received. The response rate function
obtained by Hurwitz and Millenson resembles a
comparable function obtained by Hearst (1960) with
analogous manipulations of a tΔ-tD schedule based
on positive reinforcement. The two curves differ in
the location of maximum rate, but not in their
general shapes.
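
A minimal sketch of this cycle structure may help. It assumes, as described above, that each cycle consists of an ineffective period followed by an access period with the shock due at the end of the cycle; the function names and the response times are hypothetical.

```python
def shock_delivered_by_cycle(response_times, n_cycles, t_delta_s, t_d_s):
    """For each cycle, True if its shock is delivered, False if it was deleted.

    Each cycle runs t_delta_s seconds of ineffective time followed by t_d_s
    seconds of access time; any response in the access period deletes the
    shock due at the end of that cycle.
    """
    cycle_s = t_delta_s + t_d_s
    delivered = []
    for i in range(n_cycles):
        start = i * cycle_s
        access_opens = start + t_delta_s
        access_closes = start + cycle_s
        deleted = any(access_opens <= r < access_closes for r in response_times)
        delivered.append(not deleted)
    return delivered


def access_fraction(t_delta_s, t_d_s):
    """Relative portion of the cycle in which a response can delete the shock."""
    return t_d_s / (t_d_s + t_delta_s)


responses = [5, 40]  # hypothetical response times in seconds
# With no ineffective period, both responses delete their shocks.
print(shock_delivered_by_cycle(responses, 2, t_delta_s=0, t_d_s=30))
# With a 20-sec ineffective period, the same responses fall outside the access period.
print(shock_delivered_by_cycle(responses, 2, t_delta_s=20, t_d_s=10))
```

The same two responses are fully effective in the first case and wholly ineffective in the second, which is the sense in which access to reinforcement is being scheduled.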

Sidman (1962b) reported virtually identical results 
with the same procedure, using a 15-second cycle. 
However, when he departed from the usual tΔ-tD
procedures and moved the access period to the middle
of the cycle, responses continued to be most probable
near the end of the cycle, where shock was due.
Sufficient mid-cycle responses occurred to produce a few
shock deletions, but response rates soon dropped to
near zero. The eventual decrement in overall
performance can be attributed either to the fact that
placing access to reinforcement at mid-cycle permitted 
shocks to occur immediately after some responses (late
in the cycle, when no response had occurred during the
access period) or simply to a decrement in
shock-frequency reduction.

More recently, Kadden, Schoenfeld, and Snapper
(1974) studied tΔ-tD schedules of shock-deletion in
which the probability of shock with a response in tD,
and the probability of shock if no response occurred
during tD, were manipulated independently. They
used rhesus monkeys as subjects, and after initial
shaping of the response with removal of continuous
shock, the sum of tΔ + tD was always 60 seconds. They
established stable response rates with a series of
temporal combinations (tD = 45 sec, then 30 sec, then 6
sec) in which probability of shock was 1 given no
response, and zero if a response occurred. These
features resemble the procedures by Hurwitz and
Millenson, and by Sidman, described just above. Kadden et
al. then independently varied the probability of shock
given a response and the probability of shock given
no response, presenting the values in different orders
for different monkeys. Responding always ceased when
the probability of shock was greater if a response
occurred than if no response occurred. Also, responding
ceased or dropped to very low levels when the
probability of shock given no response was reduced,
provided that the probability of shock given a response
was greater than zero. Responding was more persistent
when the probability of shock given a response was
kept at zero; the monkeys seldom paused long enough
to encounter the new consequences of not responding.

Fig. 10. (A) Number of shocks received per session; (B) lever-
press response rates for four rats as a function of the temporal
schedule parameter T. T is defined as tD/(tD + tΔ) and
represents here the relative portion of a 30-second shock-shock
interval in the period during which the first response resulted
in deletion of the next shock due. (From Hurwitz and Millenson,
1961. © 1961 by the American Association for the Advancement
of Science.)

Kadden, Schoenfeld, and Snapper’s procedure fits 
the contingency analysis of probabilistic fixed-cycle 
shock-deletion schedules that I noted earlier (p. 375). 
However, the insertion of large tΔ periods into the
cycle makes the procedure an interval schedule of 
reinforcement as well as a probabilistic contingency 
manipulation. Indeed, the patterns of responding 
within cycles revealed fixed-interval scheduling effects 
when shock deliveries permitted response patterns to 
stay in synchrony with the timing cycle. Thus, when 
probability of shock given a response exceeded zero, 
but was not high enough to eliminate responding, response
rates increased as time for the tD period (as
well as the possibility of another shock) approached. 

Shock-delay procedures have been adapted to pro- 
duce schedules even more closely resembling the tradi- 
tional basic positive reinforcement schedules. Ver- 
have (1959) imposed ratio schedules on a shock-delay 
procedure. He pretrained the animals with conven- 
tional shock-delay procedures, starting with SS = 3 
and RS = 30 sec, and then moving to SS = RS = 30 
sec. He then imposed ratio requirements so that more 
than one response was required during a given RS or 
SS interval, to delay the next shock. The response 
requirement was increased from two up to eight. Re- 
sponse rates increased concomitantly over those ob- 
tained when single responses could delay shock. Then, 
holding the ratio constant at eight, Verhave varied 
the RS interval, and found a functional relation simi- 
lar to those obtained in experiments by Sidman (1953) 
and by Clark and Hull (1966) where every response 
could delay shock (see Figure 6 above). Of course the 
absolute response rates were substantially higher than 
in these studies with a ratio of one. 
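
The way a ratio requirement changes the effect of a fixed stream of responses can be sketched as follows. This is an illustration under labeled assumptions, not Verhave's apparatus: the function name is hypothetical, the response count is assumed to start afresh after each shock and after each completed ratio, and a shock that falls due at the same moment as a response is assumed to be delivered first.

```python
def shock_times(response_times, session_s, ss, rs, ratio=1):
    """Shock times on a shock-delay schedule with a ratio requirement.

    The timer starts at the SS interval. Completing `ratio` responses restarts
    it at the RS interval; a shock restarts it at SS and (by assumption)
    clears the response count.
    """
    shocks = []
    next_shock = ss
    count = 0
    for r in sorted(response_times):
        if r > session_s:
            break
        # Deliver any shocks that fall due at or before this response.
        while next_shock <= r:
            shocks.append(next_shock)
            next_shock += ss
            count = 0
        count += 1
        if count == ratio:
            next_shock = r + rs  # the completed ratio delays the next shock
            count = 0
    # Deliver shocks that fall due after the last response.
    while next_shock <= session_s:
        shocks.append(next_shock)
        next_shock += ss
    return shocks


responses = list(range(10, 120, 10))  # one response every 10 sec (hypothetical)
print(shock_times(responses, 120, ss=30, rs=30, ratio=1))
print(shock_times(responses, 120, ss=30, rs=30, ratio=8))
```

A response rate that avoids every shock at a ratio of one avoids none at a ratio of eight, which is consistent with the higher response rates reported once the requirement was raised.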

Sidman (1966) reported a shock-delay experiment 
that he called fixed-interval avoidance. In the absence
of responding, the procedure resembled the tΔ-tD
procedure of Hurwitz and Millenson already
described. A fixed interval when responses could not
affect shock (tΔ) was followed by a fixed period in
which responding could affect an impending shock. 
Delivery of the shock started this cycle over. On
Sidman’s procedure responding could change the cycle
length. Each response in the “effective” period
(corresponding to tD on Hurwitz and Millenson’s
procedure) delayed the shock and extended that period;
response-shock intervals were in effect until
responding lapsed and shock occurred, reinstating the “tΔ”
period. Using a tΔ of 60 sec and a response-shock
interval of 6 sec, Sidman observed an acceleration of
response rate within the tΔ period. This pattern, which
is characteristic of fixed-interval schedules, was also
observed in the experiment by Kadden et al. (1974),
described above. The pattern indicates a degree of
discrimination of nonreinforcement during the tΔ
period. Hurwitz and Millenson did not find systematic
changes in response rate within the tΔ periods. This
difference of results is attributable to the fact that on
Sidman’s procedure, each tΔ period began with a
shock, providing a discriminable cue that was not
consistently present on Hurwitz and Millenson’s
procedure. The experiment by Kadden et al. confirmed
that cuing function of shock.

Sidman’s fixed-interval procedure and the tΔ-tD
procedure of Kadden et al. are not entirely com- 
parable to fixed-interval positive reinforcement pro- 
cedures, for the periods when responses cannot affect 
shock are also periods when no shocks can occur. The 
basis for reinforcement is missing, along with the ac- 
cess to reinforcement. In positive reinforcement pro- 
cedures the basis for reinforcement is continually 
present. Hence, the negative reinforcement procedures 
would have been more comparable to positive rein- 
forcement schedules if shocks were delivered during 
the tΔ periods. Similar issues arise in defining the
extinction of negatively reinforced behavior.


Extinction After Negative Reinforcement 


Extinction is the discontinuation of reinforcement, 
with continued opportunity to respond. One expects 
that responding will return to levels that occurred 
prior to conditioning. By convention, a basic or “refer- 
ence” extinction procedure implies a situation in 
which prior conditioning occurred, unchanged ex- 
cept for the withholding of reinforcement. Depending 
on its prior scheduling, the absence of reinforcement 
may or may not produce a situation strikingly differ- 
ent from that during prior conditioning. While this 
degree of difference most likely will affect how quickly 
the process of extinction occurs, the degree of differ- 
ence is irrelevant to the definition of an extinction 
procedure or process. Extinction, whether quick or 
slow, is still extinction. 

The experiments described below will illustrate the
following points regarding extinction after negative
reinforcement: Two different types of extinction
procedures have been proposed. The more traditional
one simply involves the discontinuation of shock.
However, it has been recently argued that this
removes reinforcement only indirectly, through
suspension of the drive operation upon which reinforcement
was based. By this view, discontinuing all shocks is
analogous to providing food continuously during
extinction of food-reinforced responding. An alternative
type of extinction procedure involves continued
presentation of shocks while eliminating the effects of
responses on shock. Proponents of this as a reference
procedure argue that discriminability of the
extinction situation is an issue in defining the extinction of
negative reinforcement, especially if one focuses on
the contingent relation between responding and
reinforcement rather than merely the delivery or
nondelivery of reinforcement. Finally, the fact that delivery
of free or noncontingent shocks may induce responding
in a way inconsistent with usual notions of negative
reinforcement, further complicates the definitions
and interpretations of extinction of negatively
reinforced behavior.


ALTERNATIVE PROCEDURES
FOR EXTINCTION


To extinguish responses that were conditioned with 
shock-delay procedures in which responding could
reduce the shock frequency to zero, some experimenters
have simply deactivated the shock generator. Thus, for
example, Shnidman (1968) eliminated all shocks after
rats had been trained with a few four-hour sessions of
shock-delay with SS = 5 sec and with RS = 20 or 40
sec, each rat being exposed to both values. The shock
generator was deactivated three hours after the
beginning of a session that continued beyond its usual
4-hour limit, if necessary until response rate declined to
zero for 15 min. Response rates dropped to zero within
an hour for two of three rats, and within two hours 
for the third. Using a similar extinction procedure, 
Boren and Sidman (1957) found greater persistence of 
responding. After first subjecting their rats to 100 or 
more hours of initial training with SS and RS inter- 
vals of 20 sec, they alternated periods of conditioning 
and extinction. The 6-hour sessions were divided into 
two periods, with normal conditioning for the first
part of each session, and with the shock generator off 
during the second part of each session. Responding 
during extinction declined in an orderly fashion over 
a few sessions, but in most cases did not reach zero. 
Some animals’ responding also declined during the
conditioning periods, indicating overall extinction, 
but other animals continued to respond during the 
conditioning periods. In these latter cases the periods 
when the shock generator was off clearly were dis- 
criminated from the periods when it was on, even 
though no added exteroceptive stimuli were supplied. 
Boren and Sidman suggested the occurrence of un- 
shocked nonavoidance could account for their results, 
but the experiment does not permit one to distinguish 
this from a discrimination based directly on the 
shocks themselves. 

The elimination of all shock produces confounded 
changes in two of the three basic features I have used 
for analyzing negative reinforcement procedures. The 
aversive situation is removed, and access to
reinforcement is therefore modified indirectly. This is
especially obvious in relation to escape procedures, where
conditioning is based on removal of continuous shock. 
Reinforcement of a particular response is impossible 
if the reinforcer (absence of shock) is continuously 
present. In principle, the confounding is no different 
for procedures based on intermittent shocks, although 
special procedures are required to determine whether 
a reduction in responding reflects discrimination of
the absence of shock or of the absence of contingent 
relations between responses and shock. 

Davenport, Coger, and Spector (1970), following a 
related paper by Davenport and Olson (1968), stated 
the implications of eliminating all shocks. They argued
that removal of all shocks is either suspension of
the drive operation, or else it constitutes reinforce- 
ment of all behavior since whatever the organism does 
is followed by shock omission, a consequence that dur- 
ing prior conditioning was restricted to the avoidance 
response. To provide an alternative extinction
procedure Davenport, Coger, and Spector trained rats
with 10 hours of exposure to a standard shock-delay
procedure (SS = 15, RS = 15). Then they delivered a
shock every 15 sec irrespective of behavior, starting
this extinction procedure an hour after a session
had begun. Four of the five animals virtually ceased
responding within 90 min of the noncontingent shock
procedure; the fifth had responded primarily
immediately after shocks during conditioning (reducing its
shock frequency by only 6 percent), and simply
continued in this pattern of post-shock responding. Thus,
extinction appeared to be achieved with suspension of 
the negative reinforcement contingency while main- 
taining the basis for negative reinforcement. In prin- 
ciple, reinforcement could have been delivered at any 
moment. 

This procedure did not deal with the implications 
of subjects avoiding more or less well during
conditioning. Some people, having taken this into account,
focus the definition of extinction on the contingency
between response and reinforcement, rather than
simply on the discontinuation of reinforcement
(analogous considerations also apply to positive
reinforcement). To the extent that in the animal’s history
shocks have set the occasion for reinforcement, the 
elimination of shocks during extinction is as much 
a change of cues as it is a discontinuation of the 
relation between responding and reinforcement. On 
the other hand, to the extent that the subject’s prior 
performance has completely eliminated shocks during 
training, the reintroduction of maximal shock fre- 
quency during extinction produces a situation in 
extinction that little resembles recent sessions of 
conditioning. Coulson, Coulson, and Gardner (1970) 
pointed this out, suggesting that in this context a 
proper extinction procedure would be delivery of a 
pattern of shocks similar to that observed during the 
training sessions that just preceded extinction. They 
carried this out after pretraining with a shock-delay 
procedure (SS = 5, RS = 30). They recorded the exact
sequence of shocks received by each animal during its 
final conditioning session and subsequently delivered 
the same pattern to produce an extinction procedure 
based on noncontingent shocks, which they compared 
to extinction with all shocks deleted. In the latter 
condition, responding dropped quickly to zero; with 
the “matched shock” extinction procedure responding 
declined over some 6 to 15 sessions, but typically not 
to zero.

Although they reported a somewhat slower decline 
in response rate for extinction with noncontingent 
shocks matching the shocks of avoidance conditioning, 
the results of Coulson, Coulson, and Gardner (1970) 
are not directly comparable to those obtained by 
Davenport, Coger, and Spector (1970). The two stud- 
ies used different conditioning parameters, and differ- 
ent amounts of training prior to extinction. A more 
direct comparison of these procedures has been 
achieved in my laboratory by G. D. Smith (Smith, 
1973). He first trained rats with a shock-delay pro- 
cedure (SS = 5, RS = 20) until they met a fairly 
rigorous performance criterion: over 75% of sched- 
uled shocks avoided, and stable responding assessed 
over a 2-week period. All animals were then exposed 
to three different procedures that eliminated negative 
reinforcement: shock omission; delivery of noncontin- 
gent shock every five seconds, which matched the 
shock-shock interval of the previous training program; 
and a pattern of shocks that matched the given an- 
imal’s final conditioning session. Exposures to the 
various “extinction” procedures lasted five 100-min
sessions each, and retraining to the same stability
criteria was carried out after each exposure to
nonreinforcement. All animals were exposed to the
shock-omission procedures before exposure to the other two
procedures. Then after retraining, some rats were
given the “maximum shock frequency” of
noncontingent shock, while others were exposed to the
“matched-shock” procedure. After retraining, each rat was then
exposed to the other noncontingent shock procedure,
and then retrained and reversed again. Figure 11
shows performance of a representative animal from
each of the two sequences of procedures.
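
The three ways of eliminating negative reinforcement differ only in how the shock schedule is generated, which a brief sketch can make explicit. The function name, the procedure labels, and the recorded shock times below are hypothetical; only the 5-sec shock-shock interval comes from the training procedure described above.

```python
def extinction_shock_times(procedure, session_s, ss=5.0, recorded=None):
    """Scheduled shock times for one extinction session.

    procedure: "omission" (no shocks), "fixed" (a shock every ss seconds), or
    "matched" (replay of the shock times recorded in a conditioning session).
    None of the schedules consults the animal's responses.
    """
    if procedure == "omission":
        return []
    if procedure == "fixed":
        n_shocks = int(session_s // ss)
        return [ss * (i + 1) for i in range(n_shocks)]
    if procedure == "matched":
        if recorded is None:
            raise ValueError("matched-shock extinction needs a recorded shock sequence")
        return [t for t in sorted(recorded) if t <= session_s]
    raise ValueError("unknown procedure: " + str(procedure))


recorded_shocks = [3.0, 47.5, 130.0]  # hypothetical times from a final conditioning session
print(extinction_shock_times("omission", 100.0))
print(extinction_shock_times("fixed", 20.0))
print(extinction_shock_times("matched", 100.0, recorded=recorded_shocks))
```

Because the matched schedule is a replay, its shock rate is whatever the animal's own conditioning performance produced, so it differs from animal to animal.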

Even with prior avoidance training to a stringent 
criterion, turning off the shock generator produced 
precipitous declines in responding. In contrast, re- 
sponding persisted throughout five-day exposures to 
extinction procedures based on noncontingent deliv- 
ery of shocks. On the matched-shock procedure all 
animals showed a progressive decline in responding 
over the five-session exposure, indicating an
extinction process and suggesting that more extended
exposure to this procedure would have resulted in low
levels if not complete cessation of responding. The 
results of noncontingent shock delivered according to 
the fixed shock-shock interval were more variable,
usually showing a persistent high rate of responding, but
occasionally showing low consistent rates.
Reconditioning, while accomplished after exposure to each
extinction procedure, was most disrupted after the
matched-shock procedure, but virtually immediate
after the shock-omission procedure. Stressing these
latter results, Smith (1973) argued for matched-shock as
the “reference” extinction procedure, and agreed with
Davenport et al. (1970) and Coulson et al. (1970) that
the traditional procedure of omitting all shocks
constitutes a suspension of the drive operation, and is
only indirectly a removal of the negative
reinforcement contingency.

Fig. 11. Responses and shocks per minute as a function of
sessions for a representative rat from each of two procedural
sequences. Responses and shocks are indicated by separate
symbols as noted in the Figure. Each data point represents the
mean rate for a single session. Transitions from one
experimental procedure to another are indicated by dashed lines
drawn through individual plots. The order in which individual
rats were exposed to extinction procedures is also indicated in
the individual graphs. (From G. D. Smith, 1973.)

The resulting treatment of extinction still has to 
deal with the fact that the responding of different
subjects is extinguished in the presence of different
absolute shock rates corresponding to, and thus confounded
with, the effectiveness of prior conditioning.


EFFECTS OF NONCONTINGENT SHOCKS


While the reasons for using noncontingent shock 
as the basis for extinction are quite compelling, the 
persistence of responding on such extinction pro- 
cedures complicates the analysis of behavior in an- 
imals with histories of negative reinforcement based 
on shock. Such persistence is well illustrated in rats 
trained on the shock-frequency reduction procedure
based on randomly spaced shocks, that I have already
described (Herrnstein & Hineline, 1966). While
extinction was indeed achieved by delivering
noncontingent shocks, it often took many sessions. For
example, one animal emitted some 20,000 responses at a
slowly-decreasing rate over some 12,000 min
accumulated in 100-min sessions. In light of this, it is
difficult to know whether experiments presented as
“maintenance of responding by noncontingent shocks”
are indeed what the description implies, or whether
the effects so reported are merely modulations of 
responding that is undergoing extinction. For exam- 
ple, Hurwitz, Roberts, and Greenway (1972) drew 
conclusions regarding the response-maintaining effects 
of response-independent shocks on the basis of only 
one hour on the noncontingent shock procedure, 
after more than 200 hours of preliminary training. 

Other studies have been carried out further, with 
a variety of interesting results. For example, Powell 
and Peck (1969) found that after training with 
shock-intensity reduction, noncontingent shocks maintained 
subsequent responding even when they were varied over a 
fairly wide range of intensities and frequencies. Powell 
(1972) examined effects of noncontingent shocks on 
responding after more conventional Sidman shock-delay 
procedures, and found that periodic noncontingent 
shock produced responding in 8 of 13 rats, some 
persisting for as long as 180 hours of exposure to a 
given procedure. Aperiodic shocks produced some- 
what less persistent responding, a result consistent 
with observations by Kadden (1973) using rhesus mon- 
keys. According to Powell, the temporal relations of 
responses to shock in his study indicated that much of 
the responding was shock-elicited, presumably reflect- 
ing attacks on the lever. However, even though it has 
a reflex-like time relation to the noncontingent shock, 
this responding appears to be related to the prior 
avoidance conditioning. When shock-delay training 
on a second lever was interposed before the noncon- 
tingent shock procedure was begun, the responding 
that persisted during the subsequent noncontingent 
shock procedure occurred almost entirely on the lever 
first used for shock reduction. This suggests that the 
shock was functioning both as a discriminative and as 
an eliciting stimulus. 

An analysis by Hake and Campbell (1972) of re- 
sponding produced by noncontingent shocks compli- 
cates the picture still further. Using a fixed-interval 
escape procedure with squirrel monkeys, they ob- 
served two patterns of lever-press responding. One 
pattern, post-shock responding, suggested elicitation; 
the other showed characteristics of maintenance by 
the fixed-interval schedule. When they changed the 
situation, making two operanda available—a key 
whose pressing was reinforced on the FI schedule and 
a hose conveniently available for biting—they found 
that the post-shock responding was confined to the 
bite hose, while the FI responding occurred on the 
key. So far, this appears to be a nice test of Powell’s 
(1972) notion that the post-shock responding differed 
functionally from other responding, being shock-elicited 
rather than unextinguished responding 
based on negative reinforcement. However, when Hake 
and Campbell made the bite hose unavailable, post- 
shock responding was then observed on the key. 
Clearly, this was not a biting response; they still clas- 
sified it as shock-elicited aggressive responding, but 
displaced to whatever manipulandum was available. 

Finally, McKearney has shown that in animals with 
negative reinforcement histories one can readily main- 
tain responding even with response-produced shocks 
provided that the probability of a response producing 
shock is not too high. High-probability response-pro- 
duced shock suppresses responding (McKearney, 1972; 
Powell, 1972). Thus, clearly, the effects of shock de- 
livered to animals with histories of negative reinforce- 
ment demand further analysis and will receive further 
experimentation (cf. chapter 7 of this volume by 
Morse & Kelleher). For present purposes, it suffices to 
note that long after it has been discontinued, shock 
reduction can affect an animal’s responses to shock. 
An experimenter purporting to demonstrate negative 
reinforcing effects in animals with histories of prior 
negative reinforcement must acknowledge the possible 
effects of shock delivery, which result from prior as 
well as current response-contingent relationships. 


NEGATIVE REINFORCEMENT 
WITH ADDED CUES 


So far, I have focused on procedures where shock 
is the only exteroceptive stimulus manipulated within 
conditioning sessions. In some of these procedures the 
addition of other cues would be superfluous, for the 
shock itself denotes the opportunity to respond and 
the availability of reinforcement, as well as producing 
the aversive situation. However, if shocks are brief 
and intermittent, if not all responses in the presence 
of shock are effective, or if responding in the absence 
of shock can affect subsequent shock delivery, then 
added stimuli such as tones or lights become of inter- 
est. The germinal idea for the present discussion of 
added stimuli comes from Keehn (1966), from Sidman 
(1966), and from Herrnstein (1969). They have sug- 
gested or argued that such added stimuli should be 
considered as discriminative stimuli denoting the 
availability of negative reinforcement. In addition, 
Baum (1973a) has provided a key concept, identifying 
situation transitions as reinforcements. I attempt to go 
beyond these in distinguishing several interrelated 
functions of added cues in negative reinforcement situa- 
tions. I shall start with multiple schedule experiments 
where cues correlate with whole conditioning pro- 
cedures, including aversive situations, accessibility of 
reinforcement, and periods of shock reduction pro- 
duced on a given procedure. Later will come cues de- 
noting more limited aspects of negative reinforcement 
procedures, such as particular shock-presentation 
schedules, opportunity to respond, and limited peri- 
ods when responding can affect shock. 


Added Cues Denoting Multiple Contingencies 


A simple multiple schedule can be achieved by 
correlating a reinforcement procedure with one stim- 
ulus, correlating extinction with another, and alter- 
nating the two stimuli. This was accomplished by 
Bersh and Lambert (1975). They initially conditioned 
lever-press responding in rats, with 32 sessions of 
exposure to a procedure of the kind devised by 
Herrnstein and Hineline (1966), which I have described 
previously. In the absence of responding, randomly 
spaced shocks were delivered at a mean inter-shock 
time of 5.7 sec. A response resulted in another random 
distribution of shocks, still giving a fairly constant 
moment-to-moment probability of shock, but with an 
average of 20 sec until the next shock. With delivery 
of this shock, the higher shock frequency was 
reinstated. These 100-min sessions of initial training were 
accomplished in a darkened chamber. Next a light was 
provided as a discriminative stimulus (S^D), 
accompanying the same procedure as before. This light was 
present for four-min periods, alternating with periods 
of darkness during which shocks were delivered 
according to the high-frequency shock schedule, 
independently of any responding (Darkness = S^Δ). The 
duration of the S^Δ period was determined by a delay 
contingency; a return to light and its correlated access 
to reinforcement could not occur until 20 sec had 
elapsed without a response. Later in training this 
delay requirement was extended to 40 and then to 60 
sec for some animals. Finally, two of the rats were 
placed on an extinction procedure in which the high 
shock frequency occurred independently of 
responding, and whether or not the light was on. 
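
The shock schedule underlying this procedure can be sketched as a small simulation. What follows is only an illustrative sketch, not the apparatus logic used by Bersh and Lambert or by Herrnstein and Hineline: it assumes exponentially distributed inter-shock times (one way to obtain a constant moment-to-moment shock probability) and a hypothetical subject that responds at a constant random rate. The function name, parameters, and seed are assumptions introduced here for illustration.

```python
import random


def simulate_session(response_rate_per_sec, session_sec=6000.0,
                     high_mean=5.7, low_mean=20.0, seed=0):
    """Count shocks in one simulated 100-min session (hypothetical model).

    Without a response, shocks arrive at random with mean spacing high_mean.
    A response switches to the schedule with mean spacing low_mean, which
    stays in force until the next shock reinstates the higher frequency.
    """
    rng = random.Random(seed)
    t = 0.0
    shocks = 0
    on_low = False  # False: high-frequency schedule currently in force
    while True:
        mean = low_mean if on_low else high_mean
        time_to_shock = rng.expovariate(1.0 / mean)
        # Responses matter only while the high-frequency schedule is in force.
        if on_low or response_rate_per_sec <= 0:
            time_to_response = float("inf")
        else:
            time_to_response = rng.expovariate(response_rate_per_sec)
        if time_to_response < time_to_shock:
            t += time_to_response
            if t >= session_sec:
                break
            on_low = True   # response produces the lower shock frequency
        else:
            t += time_to_shock
            if t >= session_sec:
                break
            shocks += 1
            on_low = False  # shock delivery reinstates the higher frequency
    return shocks


if __name__ == "__main__":
    print("no responding:", simulate_session(0.0))
    print("one response per sec:", simulate_session(1.0))
```

Under these assumptions a responding subject receives far fewer shocks per session than a nonresponding one, which is the shock-frequency reduction that the procedure arranges as the consequence of responding.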

Figure 12 shows a representative performance on 
this procedure. Response rates in darkness came 
under control of nonreinforcement, dropping far below 
those in the presence of the light. These reduced 
rates developed systematically, and often fairly 
quickly, even though the shock frequency in darkness 
was substantially higher than that in light, and dark- 
ness had accompanied the initial conditioning. Dur- 
ing extinction of the discriminative responding, with 
noncontingent shocks delivered on the average of 
every 5.7 sec regardless of responding or stimulus con- 
dition, the response rates slowly moved together as 
responding decreased in the presence of the light. The 
plots for initial conditioning and for the discrimina- 
tion are fairly representative for all animals. However, 
the data for the final extinction procedure are not 
representative, for the other animal exposed to this 
procedure persisted with high response rates in the 
presence of the light, and low response rates in its absence. 
The persistence of the discrimination under these 
circumstances is somewhat surprising, given the selec- 
tive extinction that produced the discrimination in 
the first place. It is less surprising in comparison with 
another experiment that used noncontingent shocks 
after histories of negative reinforcement. Appel (1960) 
placed rats on a discrimination procedure similar to 
that of Bersh and Lambert but with negative 
reinforcement provided by a shock-delay procedure. His 
animals failed to show discriminative control when 
the extinction component included delivery of 
noncontingent shocks. Systematic comparisons would be 
needed to identify critical features for establishing 
this type of discrimination. 

The multiple schedules studied by Bersh and Lam- 
bert and that studied by Appel involved reinforce- 
ment in one component and nonreinforcement with 
delivery of noncontingent shocks in the other. De Vil- 
liers (1972, 1974) has studied multiple schedules with 
shock-deletion procedures based on different shock 
frequencies in the different components. In both of 
de Villiers’s experiments the basic procedures were 
the same as in his experiment described above. In the 
absence of responding, shocks were delivered at ir- 
regular intervals, with a roughly constant probability 
of shock from moment-to-moment. He specified these 
in terms of the mean interval between shocks. The 
first response within any scheduled shock-shock inter- 
val cancelled the shock due at the end of that inter- 
val; additional responses within that interval had no 
effect. In his earlier experiment, de Villiers (1972) 
demonstrated contrast effects in addition to basic mul- 
tiple schedule effects. He first provided identical 
shock-deletion schedules in the presence and absence 
of a buzzer that alternated, on for three min and off 
for three min. Then the schedule was changed in one 
component, providing for greater or less shock- 
frequency reduction by increasing or decreasing the 
maximum shock frequency. The response rate changed 
in that component, in a direction directly related to 
the change in shock-frequency reduction. Thus, if 
shock frequency increased, so did responding (and the 
amount of shock reduction). The contrast effect was 

Fig. 12. Responses per minute for Rat 3 during initial 
conditioning, with response-contingent shock-reduction in 
darkness; discrimination training, with response-contingent 
shock-reduction in light and noncontingent shock in darkness; 
and extinction, with noncontingent shock in both light and 
darkness. Data points connected with broken lines indicate 
responding in darkness; data points connected with solid lines 
indicate responding in light. (After Bersh & Lambert, 1975. © 
1975 by the Society for the Experimental Analysis of Behavior, 
Inc.) 

revealed by response rate changes in the opposite 
direction in the presence of the alternate stimulus 
where the schedule had remained unchanged. De Vil- 
liers showed that these contrast effects could be sum- 
marized by an equation that Herrnstein (1970) had 
developed to deal with behavioral contrast in multi- 
ple schedules of positive reinforcement. 

Extending the study of stimulus control, de Villiers 
(1974) examined responding on similar multiple 
schedules, but used separate levers and cue lights for 
the different component schedules. This permitted 
further examination of shock-frequency reduction as 
a controlling variable. He also varied the frequency 
with which the components of the multiple schedule 
alternated within a session. This latter manipulation 
permitted a detailed comparison with multiple sched- 
ules of positive reinforcement. With positive rein- 
forcement it has been found (e.g., Todorov, 1972) that 
rapid alternation between components of a multiple 
schedule produces relative response rates that are 
nearly proportional to their relative rates of rein- 
forcement. Also with positive reinforcement, less fre- 
quent alternation between components produces 
smaller differences between response rates on the 
different components. De Villiers (1974) found very 
similar relationships for multiple schedules of nega- 
tive reinforcement, with amount of shock-frequency 
reduction corresponding to rate of reinforcement. 
This is shown in Figure 13, where different panels 
(bounded by vertical lines) correspond to different 
pairs of component schedules in the multiple 
schedule. The horizontal line in each panel of the figure 
indicates the relative shock-frequency reduction 
produced on the right lever, averaged over exposures to 

Fig. 13. Mean relative response rates of each rat for the last 
six sessions of each experimental condition. The mean relative 
shock-frequency reduction for each multiple schedule is shown 
by the horizontal solid lines. The data points of differing shapes 
designate performances of different rats. (From de Villiers, 

a given pair of values. Shock-frequency reduction was 
computed by subtracting the obtained shock fre- 
quency from the frequency that would have occurred 
in the absence of responding. For each schedule 
component, this measure of “r” was substituted into the 
rates on the right and left components, respectively. 
When the component schedules were alternated every 
40 sec, the relative response rates closely matched the 
relative amounts of shock-frequency reduction. When 
the component schedules were alternated more slowly, 
response rates were more nearly equal in the two 
components, giving relative values closer to 0.5. This 
effect was similar to one seen in pigeons run on simi- 
lar schedules of positive reinforcement, as noted above. 
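
The arithmetic behind these relative values can be sketched briefly. This is a minimal illustration of the subtraction and the relative measure described in the text, not de Villiers's own analysis; the function names and the shock rates in the example are hypothetical and do not come from his data.

```python
def shock_frequency_reduction(background_rate, obtained_rate):
    """Shock rate expected with no responding minus the obtained shock rate."""
    return background_rate - obtained_rate


def relative_value(right, left):
    """Share of the total that belongs to the right component."""
    total = right + left
    if total == 0:
        raise ValueError("both components are zero; relative value undefined")
    return right / total


if __name__ == "__main__":
    # Hypothetical shock rates (shocks per min) for the two components.
    r_right = shock_frequency_reduction(background_rate=6.0, obtained_rate=1.0)
    r_left = shock_frequency_reduction(background_rate=3.0, obtained_rate=1.0)
    predicted = relative_value(r_right, r_left)
    print("relative shock-frequency reduction (right):", round(predicted, 3))
```

If relative response rates match relative shock-frequency reduction, the relative response rate on the right lever in this hypothetical case would be near 5/7; slower alternation of components would be expected to pull it toward 0.5.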

I noted earlier that de Villiers found, when apply- 
ing Herrnstein’s (1970) relative reinforcement fre- 
quency equations to responding on single variable- 
interval schedules, that good fits to the matching 
relation were obtained when reinforcement rate was 
defined as degree of shock-frequency reduction. Ob- 
tained shock frequency also provided a fair approx- 
imation to the data (as it should, being correlated 
with magnitude of shock frequency reduction), but 
shock-frequency reduction provided a better fit in 
every case. On the multiple schedules considered here, 
relative shock-frequency reduction was more simply 
related to relative response rate than was received 
shock rate. I also noted earlier that since de Villiers’s 
shock-frequency reduction procedures allowed 
responding to reduce shock frequency to zero, he could 
manipulate shock frequency reduction only by vary- 
ing the “background” or maximum shock frequency, 
that which would occur only in the absence of re- 
sponding. A more powerful test of shock-frequency 
reduction would be to hold background shock fre- 
quency constant and manipulate the minimum shock 
frequency producible by responding, including other 
values as well as zero. 

Wertheim (1965) also studied multiple schedules 
of negative reinforcement, but used shock-delay pro- 
cedures. In addition to demonstrating potent stimulus 
control of responding, and contrast effects, he found 
that the relative response rate was correlated quite 
strongly with relative frequencies of shock delivery. 
However, the use of his procedures for a comparison 
of shock frequency with shock-frequency reduction is 
problematical since he used shock-shock intervals of 
zero, giving continuous shock in the absence of 
responding. 


Added Cues and the Escape Paradigm 


My discussion of negative reinforcement procedures 
began with the escape paradigm, and evolved away 
from it. A response in the presence of continuous 
shock removes that shock, producing a shock-free 
interval. Negative reinforcement is clearly the offset of 
the shock; specifying the shock-free interval is no 
problem. However, there is a continuum between 
continuous shock and sequences of intermittent brief 
shocks. The escape paradigm providing for response-produced 
shock-free intervals is effective across much 
of this continuum, but it is not clear how to specify 
the effective consequence when the duration of the 
interval between brief shocks approaches that of the 
response-produced shock-free interval. The analysis 
confronts a “figure-ground” problem: A response can 
be considered as deleting or delaying a specific shock, 
or the situation can be treated as one and continuous, 
with responses reducing the overall aversiveness of the 
situation. The experimental procedure may specify 
shock-deletion or shock-delay, but the effective vari- 
able may well be shock-frequency reduction. 

When cues are used in multiple schedules such as 
those just considered, an added stimulus is correlated 
with a whole procedure. It accompanies not only a 
schedule of shock presentation and the contingency or 
schedule of reinforcement whereby responding can 
affect the shock presentation, but also the periods of 
shock reduction that responding produces. The cue 
does not simplify interpretation of the procedure and 
its effects; the cue merely delimits the periods during 
which the procedure operates. If, however, the cue 
correlates only with particular features of a condi- 
tioning procedure, it does enter into the interpreta- 
tion of the procedure’s effects. 

The first examples to be considered here are experi- 
ments in which the cue correlates perfectly with the 
shock-presentation schedule. Onset and offset of the 
cue provide distinct boundaries to a sequence of 
intermittent shocks, permitting interpretation to rely 
again on the escape paradigm of negative reinforce- 
ment. Dinsmoor and Bonbright (reported in Dins- 
moor, 1967) directly compared such a procedure to a 
standard escape procedure that used continuous shock. 
They used a multiple schedule, consisting of white 
noise accompanied by intermittent brief shocks, alter- 
nating with continuous shock of lower intensity in 
the absence of the white noise. In both situations, the 
same variable-interval schedule was in effect, with the 
removal of the continuous shock, or of noise plus 
intermittent shock, contingent upon lever-pressing. By 
independently adjusting the shock intensities in the 
two components of the multiple schedule, they 
produced virtually indistinguishable patterns of 
lever-pressing in the two components. These performances 
were identically affected by administration of 
chlorpromazine at various doses. Thus, in this situation, 
behavior that removed continuous shock was 
equivalent to that which removed intermittent shocks and 
correlated white noise. This should come as no great 
surprise after the results of Kelleher and Morse (1964) 
who compared the effects of drug administrations on 
closely similar food-maintained and shock-maintained 
lever-press performances. As described at the begin- 
ning of this chapter, they found that drug effects were 
determined more by the specific patterns of behavior 
than by the types of reinforcement maintaining those 
patterns. 

Procedures using cues correlated with intermittent 
shocks have often been characterized as 
“reinforcement by removal of a stimulus paired with shock” 
(e.g., Azrin, Holz, Hake, and Ayllon, 1963; Dinsmoor, 
1967; Kelleher and Morse, 1964), which of course 
describes the immediate response consequence. However, 
this characterization implies that on these procedures, 
the effects of shock on responding are entirely me- 
diated by the correlated stimulus. One must still 
consider possible direct effects that the intermittent 
shocks and their deletion may have on behavior, inde- 
pendent of the correlated stimulus. Morse and Kel- 
leher acknowledge this in their more recent descrip- 
tion of these procedures by characterizing them as 
reinforcement by “termination of schedule 
complexes” (Morse & Kelleher, 1966), the schedule 
complex being the combination of intermittent shocks 
and correlated stimulus. They found evidence for 
direct effects of shocks. When developing 
fixed-interval schedules of negative reinforcement, they found 
that when intermittent shocks were delivered 
throughout the interreinforcement interval the positively 
accelerating rates commonly identified with this schedule 
(“fixed-interval scallop”) failed to appear. They 
obtained the characteristic fixed-interval performance 
only by allowing the interval to elapse before the 
intermittent shock began, as described above in the 
procedure for Figure 3. It is likely that the direct 
effects of intermittent shocks delivered during the 
interval are akin to the effects of noncontingent shocks 
delivered during extinction after conditioning with 
shock-delay procedures (e.g., McKearney, 1969). 

Dinsmoor (1962) systematically examined some 
direct effects of shocks and shock-deletion, correlated 
with a cue and cue-removal. He compared rats’ 
performances on two procedures, both of which could be 
called VI 30-sec schedules of negative reinforcement. 
Both schedules of reinforcement were superimposed 
on identical schedules of intermittent shock, specified 
by their mean shock-shock intervals. In both 
procedures responding could produce fixed shock-free 
periods during which any scheduled shocks were 
deleted. Responses during these shock-free periods had 
no effect.² In one procedure an added light and tone 
denoted the presence of the schedule of intermittent 
shocks and the shock-free periods, respectively. In the 
other procedure, no added cues were provided. Initial 
training was accomplished with fixed shock-deletion 
periods of 60 sec, and with mean shock-shock intervals 
varying from day to day, between 7.5 and 120 sec. The 
duration of shock-free periods was also systematically 
manipulated over sessions, between 15 and 240 sec for 
one animal, and between 30 and 240 sec for the other 
two. Thus, in some sessions a reinforced response 
could produce a shock-free period that exceeded the 
average shock-shock interval; in other sessions it could 
not. In some sessions the correlated lights and tones 
were used. In other sessions there were no exterocep- 
tive cues other than the shocks themselves. In all ses- 
sions a variable interval schedule provided a mean 
interreinforcement time of 30 sec if the animal re- 
sponded sufficiently often in the presence of the shock 
schedule. This time was computed by subtracting the 
reinforcement (shock-free) periods from the total time. 
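
The computation of interreinforcement time just described can be written out as a short sketch. It assumes only what the text states, that time spent in shock-free periods is subtracted from total session time before averaging; the function name and the session values in the example are hypothetical.

```python
def mean_interreinforcement_time(session_sec, shock_free_periods_sec):
    """Mean time in the shock schedule per reinforcement.

    Time spent in response-produced shock-free periods is excluded,
    so only time in the presence of the shock schedule is counted.
    """
    n = len(shock_free_periods_sec)
    if n == 0:
        raise ValueError("no reinforcements were obtained")
    time_in_shock_schedule = session_sec - sum(shock_free_periods_sec)
    return time_in_shock_schedule / n


if __name__ == "__main__":
    # Hypothetical session: 6000 sec total, sixty 60-sec shock-free periods.
    print(mean_interreinforcement_time(6000.0, [60.0] * 60))
```

A subject that responded often enough on the VI 30-sec schedule would bring this value down toward 30 sec; the hypothetical session above falls short of that.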


² This is similar to what I have previously called a 
shock-deletion procedure. However, since the duration of the 
shock-deletion period is based on time, independent of the 
shock-delivery schedule, I will refer to this consequence of 
responding simply as a shock-free period. 

As indicated by Figure 14, response rates were 
much higher with added cues. However, there was 
also substantial responding when the cues were not 
provided, especially when the mean shock-shock 
interval was less than 30 sec. Dinsmoor found that 
shock-free periods with durations of 30 sec or less tended to 
be less effective than longer values, except when the 
average shock-shock interval was extremely short. 
Above 30 sec, the duration of the shock-free period 
had little systematic effect whether or not the 
correlated cues were supplied. Relative response rates in 
the presence and absence of the shock schedule 
indicated discriminative control by the added cues. When 
performances were stable, fewer than 10% of the 
responses typically occurred during the shock-free 
periods, while without the cues these percentages were 
much higher. Thus, considered in total, Dinsmoor's 
experiment indicates both discriminative control by 
the correlated cues and potent reinforcing effects of 
the immediate response-contingent stimulus change 
that these cues provide, but it also indicates the likeli- 
hood of some behavioral control by the shocks, inde- 
pendent of the cues. 

Stimuli correlated with intermittent shocks en- 
hance the flexibility with which negative reinforce- 
ment can be scheduled. This was demonstrated in the 
experiment by Kelleher and Morse (1964), described 
early in the present chapter, showing characteristic 
fixed-interval and fixed-ratio schedule effects in squir- 
rel monkeys. In elaborating on the techniques used 
for development of such performances, Morse and 
Kelleher (1966) also summarized a number of 
characteristics that behavior on these procedures has in 
common with behavior on comparable positive 
reinforcement schedules. 

Fig. 14. Families of curves showing the mean rate of bar 
pressing by each rat as a function of the mean interval between 
successive shocks. The values identifying each curve represent 
the number of seconds for which shocks were deleted by 
an effective response. A, B, and C denote individual rats. (From 
Dinsmoor, 1962. © 1962 by the Society for the Experimental 
Analysis of Behavior, Inc.) 

Azrin, Holz, Hake, and Ayllon (1963) provided 
another striking demonstration of such effects, with 
ratio schedules. First, they exposed squirrel monkeys 
to four 6-hour preliminary sessions consisting of 5-min 
periods of intermittent shock (with the mean S-S 
interval increased over sessions, from 15 sec to 9 min) 
correlated with bright light and silence, alternated with 
5-min shock-free periods accompanied by tone and 
dimmed lights. They noted that this pretraining is 
similar to magazine training sessions in experiments 
using positive reinforcement. Subsequently, they made 
30-sec shock-free periods, with correlated stimuli, 
contingent on responding. Responding during shock-free 
periods was ineffective. The shock-delivery schedule 
continued to be accompanied by the bright light and 
silence. Moving from continuous reinforcement, they 
found extremely stable and persistent fixed-ratio 
performances with ratios of 25 and 50, and maintained 
substantial, though less stable responding with ratios 
as high as FR 300. 

Azrin et al. then held the ratio requirement at 50, 
and varied the shock-delivery schedule and the 
duration of response-produced shock-free periods. They 
found that under some conditions the duration of the 
cued shock-free period is less critical a variable than 
Dinsmoor’s (1962) results indicated. Over hundreds of 
hours of conditioning, changes in relative duration of 
the shock-free periods produced by responding had 
very little effect on performance. For example, with a 
mean shock-shock interval of ten minutes, and the 
duration of the cued shock-free period reduced to ten 
seconds, only a slight reduction in overall response 
rate was observed. Responding was also maintained, 
although at a lower overall rate, when the shock-free 
periods were as brief as two seconds. At the same time, 
the absolute shock rate correlated with the bright 
light and silence seemed not to be critical either, pro- 
vided that the shock frequencies were changed grad- 
ually. For example, when every 25th response (FR 25) 
produced a 30-sec timeout, and the shock-shock inter- 
val was increased gradually by giving 18 hours or 
more of exposure to each value, response rates re- 
mained virtually unchanged with the mean shock- 
shock interval approaching one hour. To be sure, it 
must be emphasized that these are maintenance data, 
resulting from very gradual changes in experimental 
conditions. More sudden changes in shock frequency 
or timeout duration did result in performance 
decrements. Also, these results were obtained with monkeys, 
whereas most experiments on negative reinforcement 
have used rats. 

These experiments by Azrin et al. suggest that a 
critical and potent feature of the cued escape 
paradigm is the stimulus change occurring at the onset 
of a shock-free period. Frequencies and patterns of 
responding revealed schedule control by this event, 
with only minor or limiting effects attributable to 
overall shock frequency, shock-frequency reduction, or 
duration of the stimulus-correlated shock-free periods. 
This tentative conclusion gains further support from 
an experiment by W. M. Baum (1973b) who studied 
pigeons’ time allocation in shock-correlated stimulus 
situations, as influenced by how frequently cued 
shock-free periods were initiated in those situations. 
The experiment is also of interest because it demon- 
strates another quantitative similarity between posi- 
tive and negative reinforcement. It can also be related 
to the functions obtained by de Villiers (1972, 1974) 
described earlier. 

Baum’s experiment used an experimental chamber 
whose floor was made of two platforms. If the pigeon 
stood on one platform, a red light was on; if the 
pigeon stood on the other, a green light was on; stand- 
ing on both platforms simultaneously was accom- 
panied by a white light, and a move from one plat- 
form to the other produced the white light for one 
second. All of these lights were accompanied by brief 
shocks delivered once per sec. When either the red or 
the green light was on, as determined by the bird’s 
position, two variable-interval programming timers 
were running which could initiate two-minute shock- 
free periods, with all lights off. The two variable- 
interval timers typically had differing mean intervals, 
and each could initiate a shock-free period in the 
presence of only one color. When the scheduled 
interval on one timer elapsed in the presence of the color 
associated with the other timer, the first timer’s shock- 
free interval was not delivered until immediately after 
the color changed. These procedural features closely 
resemble those of an experiment by Baum and Rach- 
lin (1969) using the same chamber, with concurrent 
schedules of food delivery instead of schedules of 
shock deletion. In that experiment, Baum and Rach- 
lin had found that the ratio of times spent on two 
sides of the chamber was proportional to the ratio of 
the corresponding frequencies of positive reinforce- 
ment. Formally, this matching relation is virtually 
identical to the repeatedly-verified matching relation 
describing relative response rates (e.g., key-pecks per 
minute) as a function of relative frequencies of posi- 
tive reinforcement (Herrnstein, 1970). 

Figure 15 shows the results obtained on Baum’s 
procedure, including individual data points for each 
of four pigeons, showing relative time allocations as a 
function of relative frequencies with which cued 
shock-free periods were initiated in the presence of 

Fig. 15. The logarithm of the ratio of time spent on the left 
to time spent on the right as a function of the logarithm of 
the ratio of number of shock-free periods obtained on the left 
to number of shock-free periods obtained on the right. The 
data from four birds are presented; each data point represents 
a value obtained from an individual bird. The solid line was 
fitted by the method of least squares. The broken line has a 
slope of one, and passes through the origin; it represents the 
performance of perfect matching. (From W. M. Baum, 1973b. © 
1973 by the Society for the Experimental Analysis of Behavior, 
Inc.) 

the red and green lights. While there was considerable 
variability, the least-squares fit of the points on Figure 
15 closely matches the line indicating a matching rela- 
tion. The birds’ relative times spent on the two sides 
of the chamber, excluding shock-free periods and 
changeover periods, matched the relative frequencies 
with which cued shock-free periods were produced on 
the two sides, again excluding the shock-free periods 
themselves. 
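
The comparison between a least-squares line and the perfect-matching line can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fit_log_matching` is hypothetical, the ratios below are invented rather than taken from Baum's data, and ordinary least squares on base-10 logarithms is assumed to be the fitting method meant in the figure caption.

```python
import math


def fit_log_matching(time_ratios, reinforcer_ratios):
    """Least-squares fit of log10(time ratio) on log10(reinforcer ratio).

    Returns (slope, intercept). Perfect matching corresponds to a slope
    of 1 and an intercept of 0 (a line through the origin).
    """
    xs = [math.log10(r) for r in reinforcer_ratios]
    ys = [math.log10(t) for t in time_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented left/right ratios, for illustration only (not Baum's data).
reinforcer_ratios = [0.25, 0.5, 1.0, 2.0, 4.0]
time_ratios = [0.25, 0.5, 1.0, 2.0, 4.0]  # time ratio equals reinforcer ratio
slope, intercept = fit_log_matching(time_ratios, reinforcer_ratios)
print(slope, intercept)
```

A fitted slope near one with an intercept near zero would correspond to the broken line of Figure 15; a slope below one would indicate time allocations less extreme than the reinforcement ratios.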

In introducing this experiment and in discussing 
its results, Baum made the plausible assumption that 
with a constant shock frequency in the shock situation, 
and a constant duration of shock-free periods, fre- 
quency of escape from the shock situation is directly 
related to magnitude of shock-frequency reduction. 
Thus he interpreted the matching relation shown in 
Figure 15 as showing reduction in rate of aversive 
stimulation to be the critical parameter of, or even 
more strongly, the definition of, negative reinforce- 
ment. He also stated that it demonstrated a quantita- 
tive function relating behavior to shock-frequency re- 
duction that was comparable to that obtained by de 
Villiers (1972, 1974) which I described earlier. The 
problem with this interpretation is that in Baum’s 
procedure the frequency with which shock-free periods 
were initiated in the presence of the red or the green 
light, and from which the independent variable of 
Figure 15 was derived, was not proportional to shock- 
frequency reduction in the presence of those stimuli. 
If we simply look at shock frequency in the presence 
of the red light versus that in the presence of the 
green light, disregarding shock-free periods as Baum 
did when computing the relative reinforcement rates, 
we find no difference. They were both equal to one 
shock per second. If instead the shock-free periods are 
included when computing shock frequencies on the 
two sides, relative shock-frequency reduction does 
equal the relative frequency of shock-free periods, 
provided that this latter measure also includes the 
durations of the shock-free periods. On this basis it is 
possible for the birds’ relative time allocations to 
match the relative reductions in shock frequency. 
However, a plot comparable to Figure 15, but using 
shock-frequency reductions computed from the data 
given in the appendix of Baum’s article in place of 
the frequency of timeout production, gave a much less 
orderly function. Thus, the measures plotted in 
Figure 15 were well chosen, but the resulting function 
does not show the relation of time allocation to mag- 
nitude of shock-frequency reduction. Rather, the 
birds’ relative time allocations matched the relative 
frequencies, in the situations where shocks occurred, 
of onset of highly discriminable shock-free intervals. 
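
The arithmetic behind this argument can be made explicit. The notation is introduced here for illustration only and is not Baum's: s is the shock rate while a colored light is on (one per second), T_i is the time spent in the presence of color i excluding shock-free periods, n_i is the number of shock-free periods initiated in its presence, and d is the duration of each shock-free period (two minutes).

```latex
% Excluding shock-free periods, shock frequency is the same on both sides:
\[
f_i^{\mathrm{excl}} = \frac{s\,T_i}{T_i} = s \qquad (i = 1, 2)
\]
% Including shock-free periods, the frequency and its reduction become:
\[
f_i^{\mathrm{incl}} = \frac{s\,T_i}{T_i + n_i d},
\qquad
s - f_i^{\mathrm{incl}} = s\,d\,\frac{n_i}{T_i + n_i d}
\]
% Relative shock-frequency reduction therefore equals the relative frequency
% of shock-free periods only when that frequency is computed over total time:
\[
\frac{s - f_1^{\mathrm{incl}}}{s - f_2^{\mathrm{incl}}}
  = \frac{n_1 / (T_1 + n_1 d)}{n_2 / (T_2 + n_2 d)}
  = \frac{n_1}{n_2}\cdot\frac{T_2 + n_2 d}{T_1 + n_1 d}
\]
```

Under these assumptions the ratio reduces to n_1/n_2, the quantity on the abscissa of Figure 15, only in the special case where the total times T_1 + n_1 d and T_2 + n_2 d are equal.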

Reinforcement Achieved by Shock Reduction 
Within a Situation, Contrasted with Reinforcement 
by Change of Situations 


The variable-cycle shock-deletion procedures used 
as components of multiple schedules by de Villiers 
(1972, 1974) identified shock-frequency reduction as a 
controlling variable, as I described earlier. The contrast 
between de Villiers’s experiments and the one by 
Baum that I have just described points up two distinct 
modes of negative reinforcement. In de Villiers’s 
experiments, as in other multiple schedules, the pres- 
ence of a given cue could be said to define a situation. 
Responding within that situation reduced the shock 
frequency within the situation but did not remove the 
situation itself. In Baum’s experiment, responding did 
not affect shock frequency in the situation itself; 
rather, it occasionally produced a change of situation, 
to one in which shock frequency was lower. It is 
tempting to identify these two modes of reinforcement 
with the familiar labels of avoidance and escape, but 
they do not conform to these labels as commonly 
understood. The effects of both can be based on averaging 
events over time. Neither is reducible to the 
other, although there must be boundary procedures 
that resemble both, when the response-produced stimulus 
change is not clearly discriminable. In some ways, 
the relationship between these two modes of reinforcement 
resembles the relationship between magnitude 
and frequency of positive reinforcement, as follows. 

Herrnstein (1970) pointed out that the matching 
relation for relative frequency of positive reinforcement 
can be expected to hold only if reinforcement 
magnitudes are equal, or if some mathematical adjustment 
is made to deal with their inequality. In addition, 
Herrnstein suggested that a matching function 
obtained with equal reinforcement frequencies but 
varied reinforcement magnitudes could be used for 
the scaling of reinforcement magnitude to make valid 
such a mathematical adjustment. Very tentatively, the 
analogous treatment of negative reinforcement can be 
spelled out by accepting Baum’s specification of fre- 
quency of negative reinforcement in terms of the 
frequency with which shock-free periods (changes of 
situation) are initiated. Magnitude of reinforcement 
could be either the duration of shock-free period or 
the magnitude of change in shock schedule or in shock 
intensity that accompanies the change of situation. 

In addition, the shock frequencies and intensities 
prior to change of situation could be treated as magni- 
tudes of drive operation, tightly defined as operations 
that potentiate reinforcement operations. De Villiers’s 
experiments with multiple schedules could be seen as 
potential scaling procedures to permit the use of varying 
magnitudes of reinforcement within Baum’s procedure. 
However, we must still contend with the distinction 
between magnitude of reinforcement as 
achieved by varying background shock frequency and 
as achieved by holding that frequency constant and 
varying response-produced lower frequencies. 


MULTIPLE RESPONSE PATTERNS PRODUCED 
BY SHOCK-DELAY PROCEDURES 


The experiments described above identify con- 
trolling features of negative reinforcement mainly 
through long-term parametric relations between these 
features and overall response rates. A more direct ap- 
proach has also been used, based on manipulation 
and observation of short-term response patterns, 
rather than on curve-fitting with overall rates. For 
example, using a shock-delay procedure, Sidman (1966) 
found that when the shock-shock and response-shock 
intervals differ (SS=20, RS=40), the spacing of re- 
sponses within those intervals differs accordingly. 
Thus differing RS and SS intervals are discriminable, 
and as they alternate the animals respond appro- 
priately to each. When the SS interval approaches 
zero, and thus is extremely short relative to the RS 
interval, responses during the shock-shock interval can 
be said to remove shock, producing a distinct change 
of situation. Responses during the RS interval are 
maintained on a much lower range of the shock- 
density continuum; their consequent shock-delay or 
shock-frequency reduction can be characterized as 
further reduction within a less aversive situation. 
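
For readers less familiar with the shock-delay procedure, its scheduling rule can be sketched as a short simulation. This is a minimal sketch under explicit assumptions, not a description of any particular apparatus: the function name `shock_times` is hypothetical, shocks are treated as instantaneous, the session is assumed to begin with the shock-shock (SS) timer running, and a response falling exactly at a scheduled shock time is assumed not to prevent that shock.

```python
def shock_times(response_times, ss, rs, session_length):
    """Return the times at which shocks occur on a shock-delay schedule.

    ss: shock-shock interval (time between shocks when no response occurs)
    rs: response-shock interval (each response postpones the next shock
        until rs time units after that response)
    Assumption: the session starts at time 0 with the SS timer running.
    """
    responses = sorted(t for t in response_times if t < session_length)
    shocks = []
    next_shock = ss
    i = 0
    while True:
        if i < len(responses) and responses[i] < next_shock:
            # A response before the scheduled shock restarts the timer at RS.
            next_shock = responses[i] + rs
            i += 1
        elif next_shock <= session_length:
            # No intervening response: the shock is delivered and the SS timer starts.
            shocks.append(next_shock)
            next_shock += ss
        else:
            break
    return shocks


# With SS=20 and RS=40, a single response at t=10 postpones the first shock.
print(shock_times([10], ss=20, rs=40, session_length=100))
```

Setting `ss` close to zero approximates the continuous-shock case discussed above, in which any response during the SS interval in effect removes shock rather than merely delaying it.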

A distinctive pattern of responding has frequently 
been noted that is related to, although not entirely 
confined to, post-shock periods. This responding has 
been characterized as “bursting.” Ellen and Wilson 
(1964) described it in detail, contrasting it with the 
spaced responding or “continual-responding pattern” 
characteristic of the response-shock interval. There 
have been repeated suggestions that this burst pattern 
is adventitiously maintained by shock removal, an 
interpretation well supported by an experiment by 
Boren (1961) which illustrates another method for 
separate study of shock-removal and shock-delay. He 
introduced a second lever into a shock-delay situation. 
One lever worked only during the response-shock 
interval; the other lever worked only during the 
shock-shock interval. Using a shock-shock interval of 
zero (continuous shock), Boren found that the burst 
pattern, even though occurring partly in the RS inter- 
val, was entirely confined to the shock-removal lever, 
the one operative only during the (zero) SS interval. 
The use of two concurrently recordable responses permitted 
a dissociation of effects of reinforcement by 
shock-removal from effects of reinforcement by shock- 
delay even when some of these effects (bursts) were not 
explicitly prescribed by the schedule. By itself this 
experiment requires a distinction between the two re- 
sponse classes only because they are controlled by 
different parts of the shock-frequency continuum. 
However, it complements the preceding experiments 
by de Villiers and by Baum, supporting a distinction 
between change of situation and reduction of aversive- 
ness within a situation. 


CONCURRENT SCHEDULES, WITH 
REINFORCEMENT BY SHOCK-DELAY 
AND BY REMOVAL OF THE SHOCK-DELAY 
PROCEDURE 


Verhave (1962) explicitly arranged for concurrent 
but separate control of behavior by these two aspects 
of negative reinforcement. He used several proce- 
dures; in all of them, responses could delay shocks 
(SS=3, RS=30). In each, some responses could pro- 
duce a “timeout” period, accompanied by a tone, in 
which both the shocks and the shock-delay contin- 
gency were absent. In an initial experiment the onset 
of the timeout period was made contingent upon the 
same lever-press that delayed shock, but with a fixed- 
interval schedule. Verhave found no indications of 
control by the fixed-interval schedule. However, in a 
later experiment where the cue-correlated timeout 
periods were made contingent upon pressing a second 
lever, they had very clear effects. In this experiment, 
initial training was accomplished by partitioning the 
conditioning chamber to provide access only to the 
shock-delay lever. After responding was stable on this 
procedure, and after irregularly spaced tone presenta- 
tions to ensure that the tone by itself had no effect on 
responding, the second lever was made accessible. De- 
pressions of this lever produced 10-min timeout pe- 
riods accompanied by the tone. Figure 16 portrays 
some of the results with the cumulative records indi- 
cating responses and shocks on the shock-delay lever. 
Downward deflections of the underlying event records 
indicate timeout periods, each produced by a response 
on the second lever. The first panels show the first 
session in which responding could produce the timeout 
periods. The first response-produced timeout occurred 


3 Within the present chapter the term “timeout” is reserved 
for shock-free periods that accompany suspension of a shock- 
delay contingency that would permit responding to extend 
shock-free intervals. It should be recognized that other authors 
have applied the term to any clearly discriminable shock-free 
period. 

Fig. 16. Cumulative records showing performance on a shock-delay procedure, with 
associated event records showing performance on a lever that produced timeout from 
the shock-delay procedure. Shocks are indicated by pips on the cumulative record. Time- 
out periods are indicated by downward deflections of the event record. The top two 
panels show the fourth session of shock-delay training, the first session in which the 
timeout procedure was in effect. The middle pair of panels shows performance during 
the fourth session in which the timeout procedure was in effect; the bottom pair shows 
performance in the next session, during which the timeout procedure was discontinued, 
producing extinction on the second lever while the shock-delay procedure continued. 
(After Verhave, 1962. © 1962 by the Society for the Experimental Analysis of Behavior, Inc.) 


very early in the session, perhaps facilitated by 
the similarity of the two levers located under identical 
pilot lights. Responding on the delay lever persisted 
undiminished throughout that timeout period. This 
responding during timeout did not begin to diminish 
until the fourth and fifth timeouts, which followed 
one another closely, suggesting a strengthening of re- 
sponding on the timeout lever. Response rates during 
timeout periods did not become reliably lower until 
close to the 11th timeout, which was after more than 
100 minutes of exposure to the timeout situation. In 
general, responding on the delay lever decreased and 
frequency of timeout production increased over the 
next few hours of conditioning; the seventh six-hour 
session indicates the typical pattern, more shock-delay 
responding and a slightly lower frequency of timeout 
production early rather than late in the session. When 
the timeout lever was deactivated, responding quickly 
recovered on the shock-delay lever (bottom 2 panels of 
Figure 16). In additional experiments, Verhave imposed 
ratio schedules on the timeout lever. When 
fewer than ten responses were required to produce a 
timeout, responding persisted on the timeout lever; 
when the ratio exceeded ten, two of three animals 
reverted to responding exclusively on the delay lever. 
So long as responding persisted on the timeout lever, 
enough accompanying responses were emitted on the 
delay lever that few shocks were received even though 
the ratio took some time to complete. 

Verhave concluded that for his subjects the timeout 
was an effective but rather weak reinforcer similar in 
its effects to food for a minimally deprived organism. 
It seems more directly analogous to a situation with 
effective deprivation but with two kinds of food reinforcers 
available, one slightly preferred to the other. 
Sidman (1962c) studied the effects of a similar pro- 
cedure with monkeys. He found that responding was 
readily maintained when cued timeout periods were 
contingent upon higher ratios, but that there were 
substantial interactions between responding on the 
shock-delay lever and the chain-pull response that 
produced timeout periods. It is likely that the animals’ 
prior histories of positive reinforcement were impor- 
tant in maintaining the behavior with higher ratios. 
The species difference may also have contributed to 
the differing persistence on ratio schedules. 


CONCURRENT REINFORCEMENT 
WITH STIMULI DENOTING 
SHOCK-FREE PERIODS 


The notion of a shock-free situation has little 
meaning except in the context of other contrasting 
situations where shocks occur at least sometimes. This 
fact and its significance have been stressed by Rescorla 
(1967), mainly in relation to Pavlovian conditioning. 
He has advocated replacing traditional stimulus-pairing 
procedures with stimulus-correlation procedures. 
In procedures where cue presentation is determined 
independently of shock presentation, cues have zero 
correlation with shock, as do cues in experiments 
where there are no shocks. In only a trivial sense do 
the latter denote shock-free periods. Positively corre- 
lated cues are those for which the probability of shock 
is greater in their presence (or immediately following 
their occurrence) than in their absence. Negatively 
correlated cues include those I have already described 
as cues correlated with or denoting shock-free periods. 
Shock probability is greater in their absence than in 
their presence. Frequently, stimulus correlation pro- 
cedures are accomplished apart from operant rein- 
forcement procedures, but assessed through their 
effects on baselines of operant behavior. In the first 
of these that I will describe, the effects to be measured 
were concurrent reinforcing effects superimposed on 
shock-delay procedures. 
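
The three-way classification of cues described above (positively correlated, negatively correlated, and uncorrelated with shock) can be stated as a simple comparison. The sketch below is illustrative only: the function name `cue_shock_correlation` is hypothetical, and shock rate per unit time in the presence and absence of the cue is assumed as a stand-in for the shock probabilities referred to in the text.

```python
def cue_shock_correlation(shocks_with_cue, time_with_cue,
                          shocks_without_cue, time_without_cue):
    """Classify a cue by comparing shock rates in its presence and absence.

    Returns "positive" if shocks are more frequent in the cue's presence,
    "negative" if they are more frequent in its absence, and "zero" if the
    two rates are equal (including the case in which no shocks occur at all).
    """
    rate_with = shocks_with_cue / time_with_cue
    rate_without = shocks_without_cue / time_without_cue
    if rate_with > rate_without:
        return "positive"
    if rate_with < rate_without:
        return "negative"
    return "zero"


# A cue during which no shocks occur, in a session that otherwise contains
# shocks, counts as negatively correlated with shock.
print(cue_shock_correlation(0, 100, 30, 300))
```

On this reading, a cue presented in an experiment with no shocks at all yields equal (zero) rates in both conditions, which is the sense in which it denotes shock-free periods only trivially.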

As noted above, Verhave (1962) found that a sepa- 
rate response, rather than the shock-delay response, 
was needed for demonstrating concurrent reinforcing 
effects of a timeout stimulus correlated with absence 
both of the shock-delay contingency and shock. Res- 
corla (1969) used a two-response procedure to demon- 
strate that at least to some degree responding main- 
tained by shock-delay is sensitive to concurrent 
reinforcing effects of a stimulus correlated only with 
absence of shocks. Using dogs, he established panel- 
press responding with the shock-delay procedure 
(SS=10, RS=30). Responding on either of two panels 
could eliminate all shocks, but responding exclusively 
on either panel for two minutes made that panel inoperative 
until a response had occurred on the other 
panel. Then in three pretest sessions with the delay 
procedure still in effect but without the two-min 
limitation, he made brief (0.5 seconds) tones of 400 
Hz and 1200 Hz each separately contingent on presses 
of different panels, with the positions reversed midway 
through the pretesting period. Panel pressing was unaffected 
by these tone presentations; however, five 
intervening sessions of conditioning with a Pavlovian 
procedure changed this. With the response panels 
removed, a 2-sec darkening of the chamber was some- 
times followed by shock 8 sec later; at other times it 
was followed in 7 sec by a tone, but with shock 
omitted. Negatively correlated with shock, the tone 
was either 400 Hz or 1200 Hz, depending on the 
animal. Subsequently the animals were re-exposed to 
the same procedure used during pretesting, with panel 
presses producing brief tone presentations as well as 
shock-delay. During this testing, responding clearly 
predominated on the panel that produced the tone 
that for the given animal had been negatively corre- 
lated with shock in the Pavlovian procedure. The 
stimulus denoting absence of shock had a reinforcing 
effect revealed in response preference. However, con- 
sistent with Verhave’s (1962) results when using a 
single response, the reinforcing effect was not evident 
in absolute rates. 

Weisman and Litner (1969) devised a more potent 
procedure for demonstrating reinforcing effects of a 
stimulus whose negative correlation with shock was 
established during separate, Pavlovian conditioning 
sessions. The stimulus had clear concurrently rein- 
forcing effects on a single response maintained by a 
shock-delay schedule. Using rats, they established a 
wheel-turning response with a delay procedure of 
SS=5, RS=20 sec, and then divided their subjects 

Fig. 17. Mean responses per minute for baseline, differential 
reinforcement of high rate (drh), differential reinforcement of 
low rate (drl), and extinction sessions. These reinforcement 
contingencies refer to the presentation of a tone, contingent 
upon changes in rates of responding that was, all the while, 
maintained by a shock-delay procedure. During intervening 
sessions in a separate chamber, the tone was presented in different 
relations to shock for two of the three groups: randomly, 
or unrelated, for TRC; negatively correlated, for CS-. For the 
third group (No CS) only shocks were presented during the intervening 
sessions. (After Weisman & Litner, 1969. © 1969 by the 
American Psychological Association. Reprinted by permission.) 


into three groups, run on identical procedures during 
alternate “test” days, but differing in their treatments 
on the intervening days. For one group (CS-, using 
Pavlovian terminology in Figure 17), 5-sec presentations 
of a 400 Hz tone were correlated with shock-free 
periods. For another (TRC in Figure 17, for “truly 
random control” after Rescorla, 1967), tones and 
shocks were presented independently, so that the 
probability of shock was equal in either presence or 
absence of tone. The third group received the same 
number of shocks, but the tone was not presented 
(“No CS”) on the intervening days in the second 
chamber. 

During the test sessions, which included continued 
training with the shock-delay procedure, presentations 
of the tone were made contingent upon changes in 
response rate. First came differential reinforcement of 
high rates (drh). During the first 30 min of the initial 
drh session the number of responses per 5 seconds re- 
quired to produce the tone was gradually increased 
from 4 to 10, where it remained. During the final 30 
min of the second test session and test sessions 3-5, 
the criterion of 10 or more responses per 5 sec re- 
mained in effect. During the shock-delay sessions on 
test days 6 and 7 the tone was not presented. These 
sessions were included to observe extinction of any 
conditioned reinforcement effects generated under the 
drh procedure. 

During test sessions 8-11, if zero or one response 
occurred in a 5-sec interval the 400-Hz tone was pre- 
sented in the succeeding 5-sec interval. Thus, during 
these sessions responding at a low rate produced the 
tone. As with the drh procedure, only the rate of re- 
sponding in the absence of the tone could affect the 
presentation of the tone, but responses occurring at 
any time during the session postponed shock. Again, 
during shock-delay sessions administered on test days 
12 and 13 the tone was not presented. These sessions 
were included to observe extinction of any condi- 
tioned reinforcement effects generated during the drl 
procedure. 
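
The drh and drl tone contingencies just described can be summarized as a rule applied to successive 5-sec intervals. The following sketch rests on assumptions that go beyond the text: the function name `tone_bins` is hypothetical, the tone is assumed to occupy exactly one 5-sec interval, and under drh the tone is assumed to be presented in the interval following the one in which the criterion was met, as the text states explicitly only for drl.

```python
def tone_bins(responses_per_bin, schedule, criterion):
    """Mark which 5-sec bins contain the tone under a drh or drl contingency.

    responses_per_bin: response counts in successive 5-sec bins
    schedule: "drh" (tone if count >= criterion) or
              "drl" (tone if count <= criterion)
    Responses made while the tone is on cannot earn a further tone,
    following the statement that only responding in the absence of the
    tone affected its presentation.
    """
    if schedule not in ("drh", "drl"):
        raise ValueError("schedule must be 'drh' or 'drl'")
    tone = [False] * len(responses_per_bin)
    for i in range(len(responses_per_bin) - 1):
        if tone[i]:
            continue  # bin i was spent in the tone; it earns nothing
        count = responses_per_bin[i]
        if schedule == "drh":
            met = count >= criterion
        else:
            met = count <= criterion
        tone[i + 1] = met
    return tone


# drl with a criterion of at most one response per 5-sec bin.
print(tone_bins([0, 5, 1, 12, 3], "drl", 1))
```

The shock-delay contingency is deliberately left out of this sketch; in the experiment it remained in effect throughout, so that responses at any time postponed shock regardless of the tone rule.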

As shown in Figure 17, for rats exposed to tones 
denoting shock-free periods during intervening sessions, 
lever-pressing maintained by the shock-delay 
contingency was increased by the drh schedule and 
decreased by the drl schedule of tone presentation. 
Rats for which the tone had not been correlated with 
absence of shock in the intervening days did not show 
any such reinforcing effects of contingent tone presentation. 

Dinsmoor and Sears (1973) used dimensional control 
by an added cue to reveal reinforcing effects of 
the cue correlated with shock-free periods. In addition, 
the cue’s negative correlation with shock was 
accomplished by presenting it in synchrony with a 
shock-delay procedure by a technique similar to one 
developed by Rescorla (1968). Dinsmoor and Sears 
used pigeons, and the shock-delay procedure (SS = 5, 
RS = 20 sec) was contingent upon treadle-pressing. In 
addition to the 20-sec response-shock interval, each 
response produced a 1000 Hz tone of 5-sec duration, 
which ensured repeated presentation of the tone dur- 
ing shock-free periods. After 58 training sessions, a 
test procedure was introduced during the final 60 min 
of every third 90-min session. In this testing, tones of 
various frequencies were made contingent upon the 
response; responding was also examined in additional 
sessions where no tones were presented. As indicated 
by Figure 18, all birds responded maximally when the 
response-produced tone had a frequency of 1000 Hz, 
as in training. For two of the three birds, response 
rates dropped substantially when the frequency of 
the tone was varied. Across birds, the degree of decre- 
ment produced by changing tone frequency was cor- 
related with the degree of decrement observed when 
the tone was deleted entirely (open data points in 
Figure 18). Noncontingent presentations of the 1000 
Hz tone resulted in relatively little responding. The 

Fig. 18. The mean number of times the response-contingent 
tone was produced by each of three pigeons during the final 60 
min of each test session, in which tones of various frequencies 
were presented, contingent on responses. The open symbols 
indicate a comparable measure for periods when no tone was 
presented. Training occurred with the tone of 1000 Hz; other 
frequencies used during testing are also indicated on the 
abscissa. (After Dinsmoor and Sears, 1973.) 


effects of varying the frequency of these response- 
produced tones are not readily attributable to the 
degree of stimulus change produced by responding, 
or to the degree of difference between stimuli correlated 
with the presence of shock (silence) and the 
response-produced stimulus. Rather, they are attribut- 
able to the physical differences between the response- 
produced stimuli and the stimulus that was previ- 
ously correlated with shock-free periods immediately 
following responses. 

Each of the four preceding experiments has demonstrated 
simultaneous control of behavior by shock- 
delay (shock-frequency reduction without distinctive 
change of situation) and by reinforcement based on 
clearly discriminable changes of situation. Verhave 
(1962) found that a response independent of the 
shock-delay contingency was sensitive to concurrent 
reinforcing effects of stimuli correlated with absence 
of shocks (and of the shock-delay contingency), while 
the shock-delay response was insensitive to these 
effects. Rescorla (1969) found a choice measure to be 
more sensitive than absolute response rate measures 
for demonstrating concurrent reinforcement effects. 
However, Weisman and Litner (1969) showed that if 
the cues correlated with shock-free periods are made 
contingent specifically upon changes in response rate, 
the delay-maintained response itself is sensitive to 
reinforcement by the cue correlated with shock-free 
periods. In these last two experiments the correlations 
between cues and shock-free periods were established 
and maintained in separate training sessions, and thus 
required special test sessions to demonstrate their 
effects. Dinsmoor and Sears (1973) demonstrated a 
method for assessing concurrent reinforcement effects 
of a stimulus whose negative correlation with shock 
was established within the shock-delay procedure it- 
self, although the stimulus change always coincided 
with the initiation of shock-delay. 


Discriminative Properties of Cues Added 
to Shock-Delay Procedures 


I have been emphasizing the reinforcing properties 
of stimuli denoting shock-free periods. At the same 
time, discriminative properties, revealed by behavior 
in the presence of the stimuli, have sometimes been 
observed. These appeared to be closely correlated 
with the reinforcing properties. For example, Dins- 
moor and Sears (1973), when gathering the data 
shown above in Figure 18, found that the frequency 
of responding in the presence of a given tone fre- 
quency was inversely related to the response rate 
maintained by response-contingent presentation of 
that tone frequency. A similar, though less systemati- 
cally examined effect was observed in the experiment 
by Verhave (1962) on reinforcing effects of timeout 
from a shock-delay procedure. There, responding on 
the shock-delay lever during timeout from the shock- 
delay procedure decreased as responding increased on 
the separate lever that produced the timeout periods. 
This relationship appears roughly similar for both 
Dinsmoor and Sears’s and Verhave’s experiments 
even though the durations of the stimuli denoting 
shock-free periods differed drastically—5 sec versus 10 
min—and one denoted absence of an entire pro- 
cedure while the other denoted the momentary ab- 
sence of shock within a similar procedure. 

In the experiments that follow, discriminative 
effects of cues paired with presence and absence of 
shock were the main focus of study. 


CUES SUPERIMPOSED ON 
SHOCK-DELAY PROCEDURES 


Rescorla and LoLordo (1965), with dogs as subjects, 
found that when a cue had been correlated with 
shock-free periods in a separate situation, superimposing 
the cue on a baseline of shock-delay responding 
produced a decrease in responding in its presence. 
This finding was replicated in rats by Weisman and 
Litner (1969). With analogous procedures these two 
studies also demonstrated increased response rates in 
the presence of stimuli that had separately been pre- 
sented in positive correlation with shock. Extrapola- 
tions based on these results must be done with caution, 
for Pomerleau (1970) showed that either enhance- 
ment or suppression could be produced by a cue 
paired with shock, depending on the duration of the 
cue. Brief cues apparently produce the simplest effects, 
since they are less affected by the shock frequency 
produced by the ongoing shock-delay schedule. 

Of special interest for the present analysis is an 
experiment by Rescorla (1968) demonstrating effects 
of cue-shock correlations established within an ongoing 
shock-delay procedure. Using different groups, 
Rescorla trained dogs with an SS interval of 10 sec 
and an RS interval of 30 sec contingent upon a free-operant 
shuttle response. For some animals a tone was 
presented for 5 sec after each response; for others, the 
5-sec tone came on 25 sec after a response. For a third 
group of dogs the 5-sec tones were presented at random. 
None of these tone presentations had any apparent 
effects on acquisition, but when the tones were 
then deleted while the shock-delay procedure remained 
in effect, response rates increased for the 
group that had received tone immediately after responses. 
Finally, the shock-delay procedure was suspended 
(no shocks delivered) and the tones were presented 
noncontingently. Greatly decreased response 
rates were observed in the presence of the tone for 
animals that had previously received tones immediately 
after responses. In contrast, animals that had 
previously received the tone 25 sec after responses 
showed greatly enhanced response rates in the presence 
of the tone. 

Rescorla interpreted his results as supporting tradi- 
tional two-factor avoidance theory, in which a clas- 
sically-conditioned response termed “fear” is said to 
generate the operant response, and removal or inhibi- 
tion of that classically-conditioned response is said to 
reinforce the operant response. Thus, the tone immediately 
following responses would be a fear inhibitor, 
and that presented just before shock (25 sec after 
response) would be a fear elicitor, each with appro- 
priate effects on shock-delay responding. Others (e.g. 
Anger, 1963) have argued that on this type of pro- 
cedure the cue itself becomes aversive through its 
correlation with shock. It then generates the response, 
with its removal reinforcing the response. Either way, 
the experiments just described (Rescorla, 1968, 1969; 
Weisman and Litner, 1969) demonstrated results 
consistent with the two-factor view. Clearly, Pavlovian-cue 
relations are imbedded in cued negative reinforcement 
procedures. We have seen that the role of 
such cue relations can profitably be studied with the 
aid of Pavlovian procedures applied outside the nega- 
tive reinforcement sessions, and some of their results 
can be predicted from principles of Pavlovian condi- 
tioning. 

However, the conditioned aversive or “fear gen- 
erating,” and thus response-generating role of cues 
correlated with shock has come into question. For ex- 
ample, Bolles, Stokes, and Younger (1966) and Bolles 
and Grossen (1969) found that responding was not 
very effectively maintained by removal of a stimulus 
paired with shock unless the shock was also deleted. 
Kamin, Brimer, and Black (1963) used a separate 
procedure outside a negative reinforcement situation 
to independently examine conditioned properties of a 
cue that was paired with shock in the negative rein- 
forcement situation. This separate procedure was the 
well-known “conditioned anxiety,” or conditioned
suppression procedure of Estes and Skinner (1941).
Using subjects at various stages of training with nega-
tive reinforcement, Kamin et al. took the cue that had
accompanied shock during that training, and pre-
sented it noncontingently during food-reinforced re-
sponding. The cue produced least suppression of
appetitive responding in animals for which the cue
was most effective for maintaining negatively rein-
forced responding. This result runs strongly counter
to predictions of the two-factor formulation, which
would require that a stimulus producing greater fear
would be more effective, both for negative reinforce-
ment and for suppression of appetitive behavior.


CUES INSERTED IN SHOCK-DELAY
PROCEDURES: “WARNING STIMULI”


The most telling evidence against the traditional 
two-factor formulation comes from procedures in 
which preshock cues, often called “warning stimuli,”
are introduced into shock-delay procedures. Some of 
these experiments have not received the attention 
they deserve, so they are presented in detail here. First 
was an experiment by Sidman (1955) who pretrained 
cats and rats in his noncued shock-delay procedure 
(SS = 20, RS = 20) and then introduced a 5-sec pre-
shock cue (light) that could be either delayed or re-
moved. That is, responses less than 15 sec apart de-
layed both the onset of the light and of the shock (RL
interval = 15 sec; RS interval = 20 sec). Responses
in the presence of the light terminated it and delayed
the shock. If no response occurred the light terminated
with the brief shock, initiating shock-light and shock-
shock intervals identical to the response-light and
response-shock intervals. With this procedure most of 
the responding came to occur in the presence of the 
light, with only 25 to 30 percent of the responses 
typically occurring in its absence, mainly in post- 
shock bursts. There was some tendency for response 
rates to increase as time for the light onset approached, 
but these rates were very low relative to response 
rates in the presence of the light. Ulrich, Holz, and 
Azrin (1964) used a similar procedure with a buzzer 
instead of a light. The response-buzzer interval was 
15 sec; the response-shock interval was 20 sec, both 
comparable to Sidman’s experiment. They used a 
shorter shock-shock interval of five sec. In the absence 
of responding the buzzer was on continuously, accom- 
panied by a shock every five seconds. They found an 
even greater concentration of responding in the pres-
ence of the preshock cue, with only a very low, con-
stant rate of responding in its absence. When they
shortened the response-buzzer interval, responding
continued to be concentrated in the period immedi-
ately after buzzer onset.
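
The contingencies of the cued shock-delay procedure can be restated as a small event simulation. The Python sketch below is an illustration only and is not taken from Sidman's report; the function name, the event labels, and the device of supplying a fixed list of response times are assumptions made here. It uses the RL and RS intervals given in the text (15 and 20 sec) and treats every response or shock as restarting both clocks.

```python
def cued_shock_delay(response_times, rl=15.0, rs=20.0, session_end=120.0):
    """Hypothetical sketch: list (time, event) pairs for a cued shock-delay session.

    rl: response-light (and shock-light) interval in sec.
    rs: response-shock (and shock-shock) interval in sec.
    """
    events = []
    pending = sorted(float(t) for t in response_times)
    reset = 0.0  # time of the most recent response or shock
    while True:
        light_on, shock = reset + rl, reset + rs
        next_resp = pending[0] if pending else float("inf")
        if min(next_resp, shock) > session_end:
            # Session ends first; record a light onset that still falls inside it.
            if light_on <= session_end:
                events.append((light_on, "light on"))
            break
        if next_resp < shock:
            pending.pop(0)
            if next_resp >= light_on:
                events.append((light_on, "light on"))
                events.append((next_resp, "response in light: light off, shock delayed"))
            else:
                events.append((next_resp, "response in dark: light and shock delayed"))
            reset = next_resp
        else:
            # No response in time: the light runs its course and ends with the shock.
            events.append((light_on, "light on"))
            events.append((shock, "shock: light off"))
            reset = shock
    return events


if __name__ == "__main__":
    for time, label in cued_shock_delay([10, 20, 48], session_end=100.0):
        print(f"{time:6.1f}  {label}")
```

Feeding in response times spaced less than 15 sec apart keeps the light off entirely, which is the pattern the animals usually did not adopt.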

If the preshock cue were to be considered a con-
ditioned aversive stimulus whose removal could rein-
force responding, one might expect that on these pro-
cedures the animals’ responding would be maintained
by its delay as well as by its removal, comparable to
responding maintained by shock-delay when no cue is
provided. That did not happen; the animals usually
did not delay the warning stimulus. To preserve the
notion of conditioned aversive stimuli one could say 
that the warning stimulus became only moderately 
aversive. Its removal could maintain responding, but 
its delay could not. 

Whatever its plausibility, this reinterpretation is
challenged by an experiment by Field and Boren
(1963), who provided “warning stimuli for warning
stimuli.” Their underlying procedure was an “adjust-
ing avoidance schedule” (Sidman, 1962d), where in
the absence of responding brief shocks occurred every 
5 sec, and each response produced 10 sec of shock-free 
time. These 10-sec periods were cumulative up to 100 
sec. Field and Boren used two sets of stimuli in this 
procedure. One was a series of 11 pilot lights spaced 
evenly in a line on the wall above the lever. Each 
lamp in ordinal sequence accompanied a specific 10- 
sec period within the 100-sec range. The 11th light, 
directly above the lever, was on during the shock- 
shock interval. The light most distant from the lever 
was on when the next shock was not due to occur for 
100 sec. The second set of 11 stimuli was a series of 
click rates ranging from 57.7 per sec during the shock- 
shock interval, to zero when shock was 100 sec dis- 
tant. Figure 19 shows typical performances with audi- 
tory and visual stimuli combined, with visual and 
auditory stimuli separated, and with no added warn- 
ing stimuli. With both sets of stimuli (top panel), the 
rat typically responded so that shock was kept 30 to 50 
sec away. When there were no added stimuli (bottom 
panel), shock was typically kept 90 to 100 sec away, 
with occasional lapses. Most interesting are the inter- 
mediate cases: performance with the auditory stimuli 
was very similar to that with both auditory and visual 
stimuli; performance with light alone was intermedi- 
ate between that for both stimuli and that for no 
added cues. Light plus clicker can reasonably be 
assumed to be a more distinctive cue than light alone. 
Thus, with the more salient or discriminable stimu-
lus changes, the animals maintained a closer prox-
imity to shock. Field and Boren interpreted their
results in terms of degree of stimulus control, and of
stimulus generalization. They argued that with more
highly discriminable stimuli, characterized by a rela-
tively sharp gradient, the animals could maintain the
relatively closer proximity to shock without incurring
an increase in shock frequency. An interpretation in
terms of conditioned aversive stimuli could probably
handle these results with certain added assumptions.
However, a simple and direct application of this ap-
proach would seem to predict the opposite results.
When more clearly discriminable stimulus changes
were provided one might expect facilitation of respond-
ing that prevented the conditioned aversive stim-
uli. Instead, the responding shifted toward stimuli
more closely associated with shock.

Fig. 19. Strip chart records showing one rat’s responding with
both classes of multiple warning stimuli, with each class of
stimuli alone, and with no warning stimuli. The pen records
the position of the stepping relay, hence the temporal proximity
of the shock is indicated in 10-sec increments. On these records
time runs from right to left. (From Field & Boren, 1963.)
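
The adjusting avoidance schedule described above can be sketched as a simple accumulator of shock-free time. The Python below is a hedged illustration, not Field and Boren's apparatus logic: the function name is hypothetical, and it assumes that each response adds 10 sec to the time remaining before the next shock, that the total can never exceed 100 sec from the moment of the response, and that shocks recur every 5 sec once the accumulated time runs out.

```python
def adjusting_avoidance(response_times, session_end, step=10.0, cap=100.0, ss=5.0):
    """Hypothetical sketch: shock times (sec) on an adjusting avoidance schedule."""
    shocks = []
    next_shock = ss  # with no responding, shocks arrive every `ss` sec
    for t in sorted(response_times):
        # Deliver any shocks that fall due before this response.
        while next_shock <= t and next_shock <= session_end:
            shocks.append(next_shock)
            next_shock += ss
        if t > session_end:
            break
        # Each response buys `step` sec, but never more than `cap` sec from now.
        next_shock = min(next_shock + step, t + cap)
    # Deliver the shocks remaining after the last response.
    while next_shock <= session_end:
        shocks.append(next_shock)
        next_shock += ss
    return shocks


if __name__ == "__main__":
    print(adjusting_avoidance([], 20.0))         # no responding
    print(adjusting_avoidance([2, 3], 30.0))     # two early responses
    print(adjusting_avoidance([1] * 12, 110.0))  # accumulation reaches the cap
```

Under these assumptions the stepping-relay position traced in Figure 19 corresponds to the accumulated time remaining before the next shock, read in 10-sec steps.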

Grabowski and Thompson (1972) obtained similar 
results with monkeys. They used lights correlated with
successive time segments of SS and RS intervals on 
conventional shock-delay procedures. 

Additional experiments by Sidman and Boren (1957b
& c) and by Sidman (1957) more directly illustrate
the discriminative rather than aversive role of pre-
shock stimuli inserted into shock-delay procedures. In
one of these, Sidman and Boren (1957b) compared
performance on two procedures, one of which was
described just above (Sidman, 1955). On that pro-
cedure, responding that delayed shock could also
delay or remove a tone presented 5 sec before a shock
was due. As noted above, most responding occurred
in the presence of the tone instead of delaying it. The
comparison procedure also used shock-delay and cue-
delay, with pretraining on a conventional procedure
whose shock-shock and response-shock intervals were
both equal to 20 sec. Similar to the above procedure, a
4-sec light preceded all shocks; responses in its absence 
delayed its onset by 16 sec while delaying shock by 
20 sec. However, responses in the presence of the 
light had no effect either on light or on shock; once 
the light came on the impending shock was unavoid- 
able. When this procedure was introduced to animals 
pretrained with no warning stimuli, responding in the 
dark, which was now the only effective responding, 
initially increased but then decreased over sessions. 
The final stable levels were not systematically related 
to response rates during initial conditioning on the 
uncued shock-delay procedure; some rates were higher 
than initially, others were lower. Any responding that 
initially occurred in the presence of the light de- 
creased systematically to very low levels. Additional 
control procedures assessed possible effects of variable 
light durations and varying correlations of light and
shock in the comparison procedure where responses
could terminate the light. Taking all of these into ac- 
count, it was clear that while both procedures satisfied 
the conditions for formation of a Pavlovian discrimi- 
nation, only in the case of the cue denoting unavoid- 
able shock did the cue function as a conditioned aver- 
sive stimulus. However, even there we could treat it 
just as a stimulus correlated with the suspension of 
the shock-delay contingency: responding in its pres- 
ence decreased; responding in its absence could still 
delay shock, and that responding persisted. Absence of 
the light, in this case, was a stimulus correlated with
access to reinforcement, and responding persisted in 
the presence of that stimulus even though it was not 
paired with shock. 

A second experiment by Sidman and Boren (1957c) 
argues even more strongly for discriminative proper-
ties of a preshock cue within delay procedures. After
pretraining on the standard noncued shock-delay pro-
cedure with SS and RS intervals of 20 sec, a cue was
added. So long as 15 sec did not elapse between re-
sponses, it was identical to the familiar cued shock-
delay procedure (Sidman, 1955). Responses could
delay the onset of the cue light, as determined by a
response-light interval of 15 sec analogous to the
usual shock-delay contingency. However, if the an-
imal paused, allowing the cue light to come on, a
response-shock interval of 5 sec was in effect such that
responses delayed the shock but also extended the
duration of the cue light. With a shock, the cue light
was turned off, reinstating the 15-sec response-light
contingency. With the transition to this schedule from
uncued shock-delay (SS = RS = 20 sec), responding
in the presence of the light increased temporarily over 
that of the comparable time periods in the pretrain- 
ing schedule and then dropped well below that in 
the dark. For three of four rats, responding in the 
dark on the 15-sec light-delay schedule increased 
slightly over that for comparable time periods in the 
prior schedule. For all rats the terminal response rate 
in the absence of the light substantially exceeded that 
in the presence of the light. In short, the animals fre- 
quently waited through the light, took the shock, and 
then responded more rapidly, delaying the renewed 
onset of the light. 

This result suggests that the low response rates 
(pauses) in the presence of the cue light in the sequen- 
tial procedure were reinforced by termination of the 
stimulus and its associated contingency, even though 
the shock whose reduction provided the basis for 
maintaining the whole performance accompanied the 
termination of that light. 

The shift of responding within a sequential pro-
cedure, from a more demanding to a less demanding
component is reminiscent of the effects observed by 
Krasnegor, Brady, and Findley (1971), which were de- 
scribed earlier in the present chapter. It indicates that 
the reinforcing or aversive properties of a cue-defined 
situation can be strongly affected by the scheduled 
reinforcement contingencies in that situation, apart 
from correlated shock delivery per se. The similarity 
of the two experiments is revealed by parametric 
manipulations with the Sidman and Boren procedure. 
These manipulations, reported by Sidman (1957), 
make the controlling features of that procedure even 
more clear. First, the response-light interval was held 
constant while the response-shock interval, operative 
in the presence of the light, was manipulated. Re-
sponse rates varied in both the presence and absence
of the light, so that more time was spent in the con-
dition permitting lower response rates. Complemen-
tary results are shown in Figure 20; these were ob-
tained by holding the response-shock interval constant
at 10 sec and varying the response-light interval from
10 to 20 sec. As the figure shows, when the two inter-
vals were equal at 10 sec, most responding occurred in
the presence of the light, directly delaying shock. How-
ever, when the response-light interval was increased,
performance changed and the animal spent most of
its time in the dark, with responding again main-
tained by the associated less stringent performance require-
ment. These functions are very similar to those of
Krasnegor et al. (1971), shown in Figure 2.

Fig. 20. Rate of responding in the light (L) and in the absence
of the light (D) for rat AA-1 as a function of the interval by
which responses in the dark could delay the light (the RL in-
terval). The RS interval, by which responses during the light
could delay shocks, was constant at 10 sec. (From Sidman, 1957.)

In a follow-up experiment, Sidman and Boren 
(1957b) moved animals from a schedule of this kind, 
with a RL interval of 15 sec and a RS interval of 10 
sec in the presence of the light, to a procedure that 
differed in that the cue light and the associated 
response-shock interval did not terminate with the 
first shock. Once on, the light remained on for five 
min, accompanied by ten-sec response-shock and shock- 
shock intervals, independently of shocks taken during 
this period. Response rates in the absence of the light,
maintained by the light-delay contingency, were
slightly higher than in the comparable procedure
where cue presentation terminated whenever a shock
was delivered. Response rates in the presence of the
lights were very high relative to rates on the com- 
parable procedure where the light terminated with 
shock. Low rates were no longer reinforced by a re- 
turn to less stringent response requirements. 


THE SEVERAL PROPERTIES OF ADDED
CUES, AND THEIR IMPLICATIONS
FOR TWO-FACTOR THEORY


Returning to the terms of my earlier analysis of 
negative reinforcement procedures without added 
cues, the preshock, or “warning” stimulus added to a 
shock-delay or shock-deletion procedure is correlated 
with several key procedural features. The added cue
accompanies the opportunity to respond, and is cor- 
related with a frequency or probability of shock. It 
also denotes periods when responding is especially 
effective, and thus is correlated with the availability 
of reinforcement. Accordingly, behavioral effects of a 
warning stimulus are confounded. The preshock cue 
cannot be seen as simply providing a classically con- 
ditioned surrogate for the shock, for several effects of 
the cue are independent of its relation to shock. A 
change in availability of reinforcement contingent 
upon behavior in the presence of a cue can markedly 
change that behavior. It can also change behavior 
that affects the onset of the cue. The manipulation of 
one correlated feature may affect changes in behavior 
directly, or indirectly by shifting control to a different 
correlated feature. In short, the familiar warning 
stimulus, traditionally used to simplify interpretation 
of negative reinforcement, is an added feature of 
great complexity.

Traditional two-factor theory of the form that
postulates internal mediating responses classically con- 
ditioned to the warning stimulus, encounters the 
above problems and more. These conditioned re- 
sponses, with their own cue properties, must them- 
selves be multiply correlated with the various aspects 
of the conditioning procedure. Further, as Black 
(1971) has argued in exhaustive detail, there is no reli-
able principle for a priori selection or identification
of a mediating response. The closer one comes to 
identifying a physiological index of a classically 
conditioned response or process that might mediate 
the operant behavior, the greater the likelihood that 
the index is an instrumentally conditioned artifact of 
the operant procedure. At the same time, the less 
susceptible the proposed mediator is to problems of 
artifact, the less adequate it is to account for the 
operant behavior. In addition, if one is concerned 
with mediation of performance, a phasic reflex re- 
sponse is far too primitive to mediate the lawful but 
complex behavior produced by the procedures we 
have been considering. In principle the Pavlovian 
model can encompass more enduring responses. In 
practice a proposed diffuse but unitary process identi-
fied by a facile label such as “fear” is likely to provide
superficial interpretation. As Myer (1971) has argued, 
present evidence does not indicate that there is a 
single, unitary, mediating process based on pairing of 
stimuli with shock. 

This is not to belittle the role of cues in negative 
reinforcement procedures, nor to disregard all aspects 
of the Pavlovian paradigm as it applies to these pro- 
cedures. The correlation of cues with presence or ab- 
sence of shocks, as systematized by Rescorla and his 
colleagues (e.g. Rescorla & Wagner, 1972), provides the
basis for the most comprehensive formulation of stim- 
ulus combination now available. However, instead of 
characterizing the positive or negative correlation of 
a cue with shock as production or inhibition of a 
Pavlovian response, it is proposed here that in operant 
behavior, the role of cues correlated with shocks is bet-
ter characterized as affecting the averaging of shock
over time. 

This averaging need not be construed in terms of 
the subject’s internal mechanism, but rather in terms 
of what methods for summarizing the independent 
variables lead to the simplest functional relationships. 
For example, comparison of periodic with aperiodic 
events suggests that equivalencies cannot be based on 
arithmetic averaging. In the case of concurrent chain 
schedules of positive reinforcement, an aperiodic se- 
quence of reinforcements is preferred over a periodic 
sequence of events with the same overall frequency
(e.g. Herrnstein, 1964); it appears that short interrein-
forcement times have disproportionately great effects
(see also Fantino, Chapter 11 in this volume). For the 
aversive case, Bolles and Popp (1964) found indica- 
tions that on a shock-delay procedure, a variable 
shock-shock interval produced acquisition more reli- 
ably than did a fixed shock-shock interval with 
approximately the same mean value. If this is inter- 
preted as effectively shortening the shock-shock in- 
terval, the results are comparable to those obtained 
by Leaf (1965), showing better acquisition with small 
shock-shock intervals. Thus we have both an appeti-
tive and an aversive example suggesting that arith-
metic averaging is inappropriate.
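
To make the point about averaging concrete, the Python sketch below compares a periodic and an aperiodic sequence of intervals that share the same arithmetic mean. The interval values are invented for illustration, and the harmonic mean is used only as one example of a summary that gives short intervals disproportionate weight; the text does not claim that this is the rule the animals' behavior follows.

```python
from statistics import harmonic_mean, mean

# Hypothetical inter-event intervals in seconds; both sequences average 60 sec.
periodic = [60.0, 60.0, 60.0, 60.0]
aperiodic = [10.0, 110.0, 20.0, 100.0]

for name, intervals in (("periodic", periodic), ("aperiodic", aperiodic)):
    print(
        f"{name:9s} arithmetic mean = {mean(intervals):5.1f} sec, "
        f"harmonic mean = {harmonic_mean(intervals):5.1f} sec"
    )
```

The two sequences are equivalent under arithmetic averaging, yet the aperiodic one has a much shorter harmonic mean, so a summary of this kind would treat it as the richer schedule in the appetitive case and the denser one in the aversive case.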

Whatever the method of averaging, the onsets and 
offsets of added cues appear to enhance or even 
change the boundaries of time intervals over which 
intermittent events are effective. With respect to inter-
mittent shocks accompanied by tones and lights, we
might consider the onsets or offsets of the added cues
as influencing the behavioral effects of shock in much
the same way that, in visual pattern perception, bound-
ary lines can influence the way that dots in the visual
field are grouped. If the lines fall between areas with
differing but homogeneous densities of dots, the
differences between these homogeneous areas will be
enhanced. If, on the other hand, there is a continuous
gradient of density of dots, drawing a line across the
gradient will effectively separate the areas. The ob-
server will report the pattern not as a gradual gradi-
ent, but rather as a group of areas with separate,
more-or-less homogeneous but different densities. So,
analogously, presentation or removal of a tone or
light may influence the way that a series of brief elec-
tric shocks will be averaged. This function is implicit
in my distinguishing between shock-frequency reduc- 
tion within a cue-defined situation and a change of 
cue-defined situations with differing shock frequencies. 
The onset or offset of a continuous cue provides a 
boundary for a group or groups of irregularly spaced 
shocks. 

The effect of cues on averaging of intermittent 
shocks is illustrated quite clearly in an experiment by 
Badia, Coker, and Harsh (1973). For initial training, 
they exposed rats to a variable-time schedule of brief, 
noncontingent, and nonreduceable shocks. The prob-
ability of shock was fairly constant over time, with a
mean shock-shock interval of four min. In addition, 
each shock was preceded by a five-sec tone. After three 
sessions of exposure to this signaled shock procedure, 
shocks occurred without the tone unless the animal re- 
sponded. A response turned on a cue light for three 
minutes, during which shocks were preceded by the
tone. Quick acquisition of responding occurred once
an animal made contact with this contingency. The
animals consistently responded, producing the situa- 
tion defined by the cue light, in which shocks were 
reliably preceded by the five-second tone. 

After stable responding was established, Badia et 
al. varied the shock frequency in the signaled condi-
tion, using mean shock-shock intervals of 2.0, 1.0, or
0.5 min. Meanwhile, they held the shock frequency in 
the unsignaled condition constant at one shock per 
four min. By not pressing the lever, the animals could 
always maintain this condition in which shock fre- 
quency remained at one shock per four min. All rats 
persisted in responding when signaled shocks averaged 
one per two min, and three of four animals persisted 
when signaled shocks averaged one per min. That is, 
responding was maintained by production of a situa- 
tion in which signaled shocks occurred four times as 
frequently as the unsignaled shocks that would occur 
in the absence of responding. A control experiment, 
with all shocks unsignaled, established that the vari-
ous shock frequencies were discriminable by the an-
imals.
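
The shock frequencies in this comparison follow directly from the mean shock-shock intervals given in the text. The short Python sketch below works through that arithmetic; the variable names are chosen here for illustration.

```python
# Mean shock-shock intervals in minutes, as given in the text.
unsignaled_interval_min = 4.0
signaled_intervals_min = [2.0, 1.0, 0.5]

unsignaled_rate = 1.0 / unsignaled_interval_min  # shocks per min
ratios = [(1.0 / interval) / unsignaled_rate for interval in signaled_intervals_min]

for interval, ratio in zip(signaled_intervals_min, ratios):
    print(
        f"signaled mean interval {interval:.1f} min: "
        f"{1.0 / interval:.2f} shocks/min, {ratio:.0f}x the unsignaled rate"
    )
```

The middle value, one signaled shock per min, is the fourfold case at which three of the four animals still responded.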

The interpretation given these results by Badia et 
al. was that the addition of preshock warning signals
effectively provided substantial shock-free intervals in
the absence of those signals. Most responses produced
the light-correlated situation denoting shock-free peri-
ods: the averaging or integrating of intermittent
shocks occurred only within the boundaries estab-
lished by the warning stimuli. In addition, it is clear
that the preshock “warning signals” did not increase
the aversiveness of the more molar situation defined
by the presence of the cue light, as might be expected
if they were to be interpreted simply in terms of con- 
ditioned aversiveness. To the contrary, they decreased 
the aversiveness of that situation. Characterized as 
affecting integration of shocks over time, they can be 
said to have produced discriminable shock-free inter- 
vals. Recall that Dinsmoor and Sears (1973) demon- 
strated dimensional control by stimuli denoting shock- 
free intervals, independently of stimuli denoting 
availability of shock or setting the occasion for the 
response. They, and a number of others (e.g. Bolles, 
1970; Rescorla, 1969) have argued for the potency of 
discriminable shock-free intervals in reinforcing be- 
havior. An attractive feature of the stimuli denoting 
shock-free periods for the analysis of behavior is not 
only their apparent potency as reinforcers, but also 
their comparatively low degree of confounding with 
various aspects of negative reinforcement procedures. 
Denoting “timeout” periods, their simplicity contrasts 
with other stimuli that are multiply correlated with
opportunities to respond, access to reinforcement, and
shock probability, frequency, or density. 


Cues Denoting Limited Response Opportunity, 
Apart from the Aversive Situation 


Free-operant procedures for examining negative 
reinforcement have permitted study of schedule effects 
that were inconceivable within traditional escape and 
avoidance procedures. They have produced diverse 
but orderly behavioral phenomena that complement 
those produced by positive reinforcement. At the same 
time, response patterns on schedules of negative rein- 
forcement have sometimes been attributed to contin- 
gencies or principles that are not readily manipulable 
or testable within a free-operant situation itself. The 
behavior often interacts with a schedule in such a 
way as to prohibit clean manipulation of the critical 
variable. Analysis of such contingencies has required 
a redeparture from the free-operant paradigm to spe-
cial forms of limited-opportunity procedures. I shall
examine the use of limited-opportunity schedules for
addressing three questions regarding negative rein-
forcement. One deals with the isolation of the con-
tingency between response and shock, independently
of shock frequency and response rate. Another deals
with the role of temporal discriminations in behavior 
on shock-delay procedures. The third question con- 
cerns the relative contributions of short-term shock- 
delay and long-term shock-frequency reduction. Inter- 
estingly, Shimp (1978) has argued for the positive 
reinforcement case that an analysis of the variables 
operative in standard schedules of reinforcement often 
requires a departure from those schedules. Some such 
analyses have used limited-opportunity procedures 
very much like those to be considered here (e.g. 
Jenkins, 1970). 


ISOLATING CONTINGENCY AND FREQUENCY 


When discussing shock-deletion procedures without 
added cues, I noted that a contingency between re- 
sponse and shock-deletion can be efficiently specified 
in terms of two probabilities. These are the probabil- 
ity of shock if a response occurs, and the probability 
of shock given no response. As pointed out by Church
(1969), Catania (1971) and several others, the degree of 
contingency can be described as the difference be- 
tween the two probabilities. However, in uncued free- 
operant procedures this specification tells only part 
of the story, for one must also specify the interval for 
which the probability operates. For example, in a
standard fixed-cycle shock-deletion procedure, the
probabilities are specified in relation to the shock- 
shock interval. An example is shown in part I of 
Figure 7, where the probability of shock given a 
response is zero, and the probability given no response 
is one. In schedules defined under the t^D – t^Δ sched-
ules devised by Schoenfeld and his associates (e.g. part
III of Figure 7; also, see Kadden, Schoenfeld, & Snap-
per, 1974), the probabilities are specified in relation to
t^D, which is a portion of the shock-shock interval.
Within either of these situations, the duration of the 
period for which the probabilities are specified has 
strong implications. That period determines both the 
maximum possible shock rate, and the minimum re- 
sponse rate required to achieve a given amount of 
shock reduction. In uncued situations these are both 
powerful variables. 
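
The two-probability specification, and the way the period constrains rate, can be written out in a few lines. The Python below is a sketch under stated assumptions: the function names are hypothetical, the sign convention (probability of shock given no response minus probability of shock given a response) is a choice made here, and at most one shock is assumed per period.

```python
def contingency(p_shock_given_response, p_shock_given_no_response):
    """Degree of contingency as the difference between the two probabilities."""
    return p_shock_given_no_response - p_shock_given_response


def max_shock_rate_per_min(period_sec):
    """Highest possible shock rate when at most one shock can occur per period."""
    return 60.0 / period_sec


def min_response_rate_per_min(period_sec):
    """Lowest response rate that can delete every shock: one response per period."""
    return 60.0 / period_sec


if __name__ == "__main__":
    # Standard fixed-cycle shock deletion: P(shock | R) = 0, P(shock | no R) = 1.
    print(contingency(0.0, 1.0))
    for period in (20.0, 5.0):
        print(period, max_shock_rate_per_min(period), min_response_rate_per_min(period))
```

The same contingency value of 1.0 is compatible with very different maximum shock rates and minimum response rates, which is why the period must be stated alongside the probabilities.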

The confounding of rate with contingency can be 
mitigated by adding a cue that coincides with the
period for which the probabilities are specified. Cue
duration is still a variable to be considered, but to
some degree it can be moved to the background along
with overall response and shock rates. If the cue is
removed by a response, no more than one response
can occur per cue presentation. The shock can be
limited to a maximum of one per cue as well, so that
both can be described by probabilities instead of fre- 
quencies. 

Neffinger and Gibbon (1975) have reported an ex-
periment that separates frequency and contingency in
this way. Using rats, they superimposed a tone on a
basic fixed-cycle shock-deletion procedure that was 
virtually identical to that diagrammed above in part 
I of Figure 7. With no responding, a 0.5 sec shock was 
delivered every 20 sec; a tone was on during the inter- 
vening 19.5 sec during which a response could affect 
shock. A response in the presence of the tone termi- 
nated the tone, and deleted the shock due at the end 
of that cycle. The tone came on again when the shock 
had been deleted, starting a new 20-sec cycle. 

The addition of the tone converted this to a dis- 
crete-trial procedure. At least Neffinger and Gibbon 
treated it as such, for they ignored responses in the 
absence of the tone. They trained their rats with the 
conventional probabilities: probability of a shock was 
zero if a response had occurred; probability of a shock 
given no response was one. Also, in this training they 
discarded approximately 70 percent of their animals, 
which failed to meet a stringent performance cri- 
terion. Then the two probabilities were manipulated 
independently, but with interspersed retraining at the 
original values. The left panel of Figure 21 shows the 
effects of decreasing the probability of shock given no
response, while holding the probability of shock given
a response constant at zero. With this manipulation,
response probabilities remained relatively high until
the probability of shock given no response was very
low.

Fig. 21. The left panel shows response probability as a func-
tion of the conditional probability of shock given no response,
P(Sk/~R), while the probability of shock given a response,
P(Sk/R), was constant at zero. The right panel shows response
probabilities when the shock probabilities for responding and
for not responding were equal, at values greater than zero.
Each plot represents a different rat. Data are taken from the
final three days of exposure to the schedule values, and the
functions are plotted from left to right in the order in which
the probability values were studied. (After Neffinger & Gibbon,
1975. (c) 1975 by the Society for the Experimental Analysis of
Behavior, Inc.)

The right panel of Figure 21 shows the effects of 
subsequently eliminating the contingency, by making 
the probabilities of shock equal whether or not the 
rats responded. The performances separated into two
distinct classes: some animals quit responding; others
responded persistently. With further analysis, Neffin-
ger and Gibbon argued that one class was sensitive
only to contingency; the other was sensitive both to 
contingency and to absolute shock density. Inter- 
estingly, these two classes of animals could not be dis- 
tinguished in the left side of the figure, or in original 
training. As the authors noted, the distinctness of two 
classes of performance may have resulted partly from 
their stringent subject selection procedures. 
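
The difference between the two manipulations can be seen by computing the expected number of shocks per cued cycle from the response probability and the two conditional shock probabilities. The Python below is a minimal sketch; the function name and the particular probability values are assumptions for illustration and are not Neffinger and Gibbon's data.

```python
def expected_shocks_per_cycle(p_response, p_shock_given_response, p_shock_given_no_response):
    """Expected shocks per cycle when at most one response and one shock can occur."""
    return (
        p_response * p_shock_given_response
        + (1.0 - p_response) * p_shock_given_no_response
    )


if __name__ == "__main__":
    for p_response in (0.0, 0.5, 1.0):
        with_contingency = expected_shocks_per_cycle(p_response, 0.0, 0.5)
        no_contingency = expected_shocks_per_cycle(p_response, 0.3, 0.3)
        print(p_response, with_contingency, round(no_contingency, 3))
```

With the probability of shock given a response held at zero, responding lowers shock density in proportion to response probability. With the two probabilities equal, shock density stays the same whatever the animal does, so persistent responding in that condition cannot be maintained by shock reduction.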


THE ROLE OF TEMPORAL
DISCRIMINATIONS IN BEHAVIOR ON
SHOCK-DELAY PROCEDURES


It is well established that subjects trained on a 
shock-delay schedule distribute their responses nonrandomly
in time. The momentary probability of a
response typically increases as time elapses during the 
shock-shock or response-shock interval (Anger, 1963; 
Sidman, 1966). Gibbon (1972) delineated some char- 
acteristics of the timing process needed to produce 
these response distributions, arguing that it has 
“scalar” properties, functioning with units proportional
to the intervals being timed (such as response-shock
intervals). Libby and Church (1974) provided
additional evidence for the scalar property, showing,
with a free-operant shuttle response, that rats’ spacing
of responses is proportional to the response-shock in- 
terval. Some interpretations of the conditioning proc- 
ess have been founded upon the observed response 
distributions. For example, Anger (1963) used them to
enable two-factor theory to explain conditioning on 
uncued shock-delay procedures. He proposed that the 
animal’s internal stimuli correlated with times close 
to the response-shock interval become aversive through 
pairing with shock. When responses occur, these stimuli
were said to be replaced with other internal time-correlated
stimuli that are not closely paired with shock.
Responding, then, was said to be reinforced by removal
of conditioned aversive stimuli whose temporal prop- 
erties are referenced to the response that produces the 
shock-delay, establishing a “time-zero.” 

Rescorla (1968) supported a two-factor interpreta- 
tion by demonstrating the potential for Pavlovian 
conditioning based on features of the shock-delay procedure.
These features involve the pairing of responses
and shocks in fixed time relations. He pretrained
his subjects on a shock-delay procedure
(SS = 10, RS = 30) and then gave independent train- 
ing with a Pavlovian trace conditioning procedure in 
which onset of a 5-sec tone preceded unavoidable 
shock in the same time relationship as the response- 
shock interval of the shock-delay procedure. That is, 
shock followed the offset of tone by 25 sec. Then, 
when he superimposed the tone upon the ongoing 
shock-delay procedure, he found response acceleration 
in phase with the 25-sec tone-shock interval, showing 
that the temporal features of the Pavlovian procedure 
affected responding maintained by shock-delay. Res- 
corla presented this as evidence for a mediating emo- 
tional response which would normally be conditioned 
to feedback stimuli from the shock-delay response. 

I have argued above that preshock stimuli are bet- 
ter characterized as discriminative stimuli, rather than 
as conditioned aversive or emotion-producing stimuli.
Dinsmoor and Sears (1973) provided such an account,
proposing a “fading trace” initiated by each
response and denoting shock-free periods. They supported
their interpretation with the demonstration,
illustrated above in Figure 21, of dimensional control
by stimuli paired with shock-free short post-response 
times. 

By all three accounts—Anger, Rescorla, and Dins- 
moor and Sears—the animal’s spacing of responses in 
time reflects an underlying process fundamental to 
the reinforcement of responding on the shock-delay 
procedure. However, there is another likely interpre- 
tation of the spaced responding. A shock-delay pro- 
cedure, viewed as a procedure for reinforcement by 
shock-frequency reduction, provides differential rein- 
forcement of spaced responding. Within the response- 
shock interval, the more widely spaced the responses, 
the greater the magnitude of reinforcement for a 
given response. This view suggests that the production 
of spaced responding is a second-order process, not 
critical to the maintenance of responding per se, but 
affecting only the distribution of responses. Sidman 
(1966) supported this latter view with a variety of 
arguments. Some of his supporting data came from 
concurrent schedules; some came from demonstrations 
that temporal discriminations often do not appear 
until after responding is well established.

Hineline and Herrnstein (1970) addressed the ques- 
tion with an experiment that eliminated the possible 
differential reinforcement based on shock-frequency 
reduction. The experiment used a modified version of 
the fixed-cycle shock-deletion procedure described 
earlier (part I of Figure 7). As before, brief shocks oc- 
curred every 20 sec, defining a 20-sec cycle. Also as in 
the free-operant fixed-cycle procedure, the first lever- 
press within a cycle cancelled the shock due at the end 
of that cycle. In addition, the lever was retracted from 
the chamber immediately after the response, permitting
only one response per cycle, and was re-extended
to initiate the new cycle; a buzzer was correlated 
(positively or negatively for different animals) with 
presence of lever, also denoting the opportunity to respond.
Thus, each response eliminated only one
shock, no matter where it occurred within the 20-sec 
cycle. The beginning of a cycle was cued either by a 
shock or by stimulus changes correlated with insertion 
of the lever into the chamber. Distributions of re- 
sponses within the cycle were compared to inter- 
response time distributions and shock-response time 
distributions of a conventional shock-delay procedure 
with shock-shock and response-shock intervals of 20 
seconds.

As expected, animals on the shock-delay schedule 
frequently showed temporal discriminations, indicated 
by increasing probabilities of responding late in the 
response-shock interval. Animals on the fixed-cycle
schedule with limited response opportunity sometimes
showed greatest probability of response late in the
cycle, indicating a timing process. But sometimes the 
probability of response decreased with time during 
the cycle, and sometimes it was fairly constant, indi- 
cating randomly spaced responding. These changes 
were not correlated with changes in number of shocks 
deleted. The changes in distribution of responses
within a cycle were sometimes sudden, but more often 
were gradual and systematic. One of the widest and 
most systematic changes of response distribution is 
shown in Figure 22. In this figure, the slope of each 
function is the important feature. Positive slopes in- 
dicate timing, with the momentary probability of 
response increasing with time since trial (cycle) onset. 
A horizontal slope indicates randomly spaced re- 
sponding, with the momentary probability of response 
constant over time. This rat clearly showed slowly 
shifting response distributions, with continuity of per- 
formance from week to week, but with wide swings in 
the distribution of responses within the 20-sec cycles.
The distribution of responding was unrelated to pro- 
ficiency of performance, for the animal took fewer 
than one percent of the possible shocks in any session. 

The data of this experiment clearly revealed that 
processes producing spaced responding need not be 
critical to the maintenance of responding. This challenges
the two-factor interpretations, since the stimulus
change initiating each cycle should have provided
excellent cues for temporal conditioning. Dinsmoor 
and Sears’s interpretation of timing is perhaps least 
challenged by this experiment, given that a warning 
stimulus (presence of lever, and correlated buzzer) was 
present until the shock-deleting response had occurred, 
and the response produced the same feedback when- 
ever it occurred. However, their interpretation still 
would not predict the observed gradual changes in 
spacing of responses within a cycle, from timing to 
non-timing and back again. 

On the other hand, Sidman’s notion regarding the
basis for spaced responding is not supported either, 
for the experiment shows that spaced responding need 
not be based on differential reinforcement through 
shock-frequency reduction. The experiment does support
Sidman’s contention that spacing of responding
is a second-order process, and that Pavlovian pairings
with temporal stimuli or stimulus traces are not necessary
to the maintenance of responding. As I described
earlier, Neffinger and Gibbon (1975) used a procedure 
similar to that of Hineline and Herrnstein (1970), except
that their lever did not retract when responses
occurred. They too found the distributions of responding
to be unrelated to proficiency of performance, and
concluded that timing is more collateral than causal
in behavior on shock-delay procedures.

Fig. 22. Changes in conditional probability of response as a
function of time since the beginning of a cycle (or trial) for
one rat. The figure is based upon eight consecutive weeks during
conditioning with the fixed-cycle shock-deletion procedure,
and limited response opportunity. The conditional probabilities
were computed by dividing the number of responses in a
2-sec class interval by the number of opportunities for response
in that class interval. The computation compensates for the
fact that each response in a given class interval eliminated the
opportunity for response in intervals later in that cycle. To clarify
the continuity of performance changes, plots for successive
weeks, numbered at the right of the figure, are displaced upward,
in consecutive order. The size of scale indicated on the
ordinate is valid for all plots. Absolute values for the 0-2 sec
class interval, for the successive weeks, were .32, .45, .76, .72, .70,
.55, .45, and .22, respectively. (From Hineline & Herrnstein, 1970.
© 1970 by the Society for the Experimental Analysis of Behavior,
Inc.)
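
The computation described in the caption to Figure 22 can be sketched as follows. This is an illustration only, not the original analysis: the function name and the response latencies are hypothetical, and the sketch assumes at most one response per cycle (the lever retracts after a response), so that a cycle counts as an opportunity for a class interval only if no response occurred in an earlier interval of that cycle.

```python
def conditional_response_probabilities(latencies, n_cycles, cycle_len=20, width=2):
    """Responses per class interval divided by opportunities in that interval.

    latencies: time (sec since cycle onset) of the single response in each
               cycle that contained a response (hypothetical data).
    n_cycles:  total number of cycles, including cycles with no response.
    """
    n_bins = cycle_len // width
    counts = [0] * n_bins
    for t in latencies:
        # Assign each response to its 2-sec class interval.
        counts[min(int(t // width), n_bins - 1)] += 1

    probs = []
    remaining = n_cycles  # cycles in which the lever is still present
    for count in counts:
        probs.append(count / remaining if remaining else None)
        # A response removes the lever, so that cycle offers no later opportunity.
        remaining -= count
    return probs


# Hypothetical session: 10 cycles, 6 of them with a response.
latencies = [1.0, 1.5, 3.0, 5.0, 5.5, 19.0]
probs = conditional_response_probabilities(latencies, n_cycles=10)
print(probs)
```

A flat list of values corresponds to the horizontal slope described in the text (randomly spaced responding); values that rise across class intervals correspond to timing.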
DISSOCIATION OF SHOCK-DELAY AND
SHOCK-FREQUENCY REDUCTION 


Two variables have been confounded in nearly all 
procedures for negative reinforcement. Responding 
reduces overall shock-frequency or density, and at the 
same time it produces short-term delay of shock, a 
shock-free interval. Even the random-shock procedures 
of Herrnstein and Hineline (1966) and of de Villiers 
(1972, 1974) can be described in terms of average time 
between response and shock, contrasted with time be- 
tween shocks in the absence of responding. Such a 
translation is mathematically tautological, in that av- 
erage delay is merely the reciprocal of shock fre- 
quency. Still, it is also a change in emphasis from 
molar, long-term controlling relations to short-term 
controlling relations.

In the experiment described just above, the molar 
and molecular features of behavior were dissociable
in that the molar features (degree of shock-frequency
reduction, or probability of response per trial) were
stable while the microstructures of performance were
highly unstable. The experiments that follow focus
on procedural dissociation of the molecular and molar
aspects of negative reinforcement.

In these experiments the opportunity to respond 
was again limited, to permit precise control of the 
delay between response and shock with no intervening
responses to complicate the relationship. Delivery
of a shock at times between response opportunities 
permitted the shock frequency to be manipulated 
somewhat independently of short-term delay. The first 
such procedure (Hineline, 1970) provided for re- 
sponses to produce shock-delay without affecting over- 
all shock frequency. As shown in part I of Figure 23, 
the procedure was based on fixed 20-sec cycles. Time 
specifications within a cycle began with insertion of 
the lever into the chamber. In the absence of respond- 
ing the lever was accessible for alternate 10-sec peri- 
ods, with a shock occurring at the eighth second of the 
lever’s presence. A response prior to that point pro- 
duced immediate removal of the lever and delayed the 
shock from the 8th to the 18th second of the cycle. 
There was still one shock per 20 sec. This procedure 
maintained responding in each of five rats exposed to 
it, two of which had prior training with a conven- 
tional cued shock-delay procedure, and three of which 
were experimentally naive. 

These results are compared with a second experi- 
ment, using the procedure diagrammed in part II of 
Figure 23. In the absence of responding this pro- 
cedure was identical to the one just described: 20-sec 
repeated cycles began with insertion of the lever into
the chamber; a shock occurred at the eighth second,
and the lever retracted at the 10th second for the 
remainder of the cycle. As before, a response removed 
the lever and delayed the shock. However, this was 
accomplished by production of a 10-sec interval with 
the lever out, and a shock occurring 8 sec after the 
response. There was still one shock per cycle, but
responses shortened the cycle, increasing overall shock 
frequency. Response rates of the two rats with previ- 
ous training decreased systematically to negligible 
levels, within five sessions and nine sessions respec- 
tively. Eleven naive rats placed directly on this pro- 
cedure never responded on more than 30 percent of 
the cycles. Typically, responding rose to 20 or 25 per- 
cent of the cycles during a few of the early sessions, 
and then systematically decreased to zero. 
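
The contrast between the two procedures can be made concrete with a small calculation. The sketch below is illustrative only; the function names are hypothetical, and for the second procedure it assumes that the next cycle begins as soon as the 10-sec lever-out interval produced by the response has elapsed, which is how the text's statement that responses shortened the cycle is read here.

```python
def procedure_one(response_time=None):
    """Part I of Figure 23: fixed 20-sec cycle, constant shock frequency.

    A response before the 8th sec moves the shock from the 8th to the 18th sec.
    """
    if response_time is None:
        return {"shock_at": 8.0, "cycle": 20.0, "rs_interval": None}
    return {"shock_at": 18.0, "cycle": 20.0, "rs_interval": 18.0 - response_time}


def procedure_two(response_time=None):
    """Part II of Figure 23: a response produces a 10-sec lever-out interval
    with the shock 8 sec after the response (assumed to end the cycle)."""
    if response_time is None:
        return {"shock_at": 8.0, "cycle": 20.0, "rs_interval": None}
    return {
        "shock_at": response_time + 8.0,
        "cycle": response_time + 10.0,  # assumption: next cycle starts here
        "rs_interval": 8.0,
    }


def shocks_per_minute(outcome):
    """One shock per cycle in both procedures."""
    return 60.0 / outcome["cycle"]


# Hypothetical response 4 sec after insertion of the lever.
one = procedure_one(4.0)
two = procedure_two(4.0)
print(one["rs_interval"], shocks_per_minute(one))
print(two["rs_interval"], shocks_per_minute(two))
```

Under these assumptions a response in the first procedure lengthens the response-shock interval while leaving shock frequency at three per minute, whereas a response in the second procedure yields an 8-sec response-shock interval at the cost of a higher shock frequency.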

The first of these two experiments indicates that 
short-term shock-delay can function as a reinforcer 
independently of overall shock frequency. The pro- 
cedure used a variable response-shock interval (10 to 
18 sec, depending on the placing of the response
within a cycle), and constant shock frequency. In the 
second experiment, shock-delay was apparently not
sufficiently potent to override an opposing change in
shock frequency. This procedure provided a shorter, 
constant response-shock interval (8 sec), as well as in- 
creased shock frequency contingent upon responding. 
A complicating feature of both procedures was the 
delivery of shocks sometimes in the presence of the 
lever and sometimes in its absence. While shock fre- 
quencies in the presence and in the absence of the 
lever (each accumulated over all cycles) did not indicate
this as the source of response strength,⁴ another
set of procedures was devised to eliminate possible 
confounding effects of these relative shock frequencies. 

The revised procedure (Hineline, 1969) permitted 
complete dissociation of the opportunity to respond 
from the situation in which shocks occur. It also permitted
wider ranges of shock-delay and shock frequency.
As shown in part III of Figure 23, all variants
of the procedure were identical if no response oc- 
curred. That sequence of events is portrayed just 
above the dotted line in the figure: a cycle began 
with insertion of the lever into the chamber; it
retracted at the 10th sec, a shock (1.0 sec; 0.8 mA) was
delivered at the 11th second, and a new cycle began
with reinsertion of the lever at the 60th second.


⁴ Two of the three naive animals placed on the constant
frequency procedures persistently responded so as to produce
substantially higher shock rates in the absence of the lever than
in its presence. Herrnstein (1969) showed that the principle of
shock-frequency reduction under stimulus control can account
for this experiment if the computation of shock frequencies is
based only on stimulus exposures (presence or absence of lever)
during which shock occurred.

A sequence of procedures was run; each provided a 
different consequence for responding. The first was a
shock-deletion procedure. If a response occurred, the
lever retracted immediately and no shock was delivered
on that cycle. The second was an extinction procedure.
If a response occurred, the lever retracted immediately,
but the shock was still delivered at the 11th
sec. The third was a shock-delay procedure with con- 
stant overall shock frequency. If a response occurred, 
the lever retracted immediately and the shock was 
delayed from the 11th to the 39th sec. This was fol- 
lowed by shock-delay procedures where the delay was 


Fig. 23. ... order to isolate shock-delay from shock-frequency.
Time is represented linearly from left to right, as indicated on
the top line of each diagram. Upward displacement of a line
indicates insertion of the lever at the beginning of a cycle.
Downward displacement indicates retraction of the lever. An “S”
marks the delivery of a shock; an “R” indicates the occurrence
of a response. (From Hineline, 1969, 1970.)


achieved at the expense of increases in overall shock 
frequency. These procedures were identical to the 
third procedure, except that one, two, three, or four 
additional shocks were delivered in cycles where the 
response occurred. For two shocks per response, the 
shocks occurred at the 39th and 47th secs of the cycle. 
The larger numbers of shocks were added with the
spacing indicated at the bottom of the diagram, filling 
in the interval between the 39th and 47th secs. 

Each rat was first trained until stable on the first,
shock-deletion procedure. Next, each rat was placed
on the second (extinction) procedure until responding
decreased to less than 5 percent of the trials. The
third procedure, shock-delay, was instituted next; if a
rat failed to respond on this, it was reconditioned on
the shock-deletion procedure and then moved to the 
shock-delay procedure. If a rat developed stable re- 
sponding on the third procedure, the final sequence 
was instituted, providing two, three, four, and then 
five shocks per response. 
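
The consequences of a response under each procedure in this sequence can be summarized in a short sketch. It is an illustration only: the names are hypothetical, and because the exact spacing of the added shocks is given only in the diagram, the sketch records just the number of shocks in a cycle and the time of the first one.

```python
CYCLE_SEC = 60           # a new cycle begins at the 60th sec
BASELINE_SHOCK_SEC = 11  # shock time when no response occurs
DELAYED_SHOCK_SEC = 39   # first shock after a response on the delay procedures


def outcome(procedure, responded, shocks_per_response=1):
    """Return (number of shocks in the cycle, time of first shock or None)."""
    if not responded:
        # All variants are identical when no response occurs.
        return 1, BASELINE_SHOCK_SEC
    if procedure == "deletion":
        return 0, None
    if procedure == "extinction":
        return 1, BASELINE_SHOCK_SEC
    if procedure == "delay":
        return shocks_per_response, DELAYED_SHOCK_SEC
    raise ValueError(f"unknown procedure: {procedure}")


for n in (1, 2, 3, 4, 5):
    shocks, first = outcome("delay", responded=True, shocks_per_response=n)
    baseline, _ = outcome("delay", responded=False)
    print(n, first, shocks / baseline)
```

With one shock per response the delay is obtained at constant shock frequency; with five shocks per response an animal that responds on every cycle receives five times the baseline number of shocks.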

One of the six rats responded on over 90% of the 
cycles on the initial, shock-deletion procedure. Four
others stabilized at approximately 80%, 75%, 70%,
and 20%, and one animal seldom responded at all, even
with hand-shaping. All animals’ responding ceased 
within a very few sessions of exposure to the second 
procedure where a shock occurred at the 11th second 
independently of responding. Only two animals re- 
sumed responding when transferred directly from this 
to the shock-delay procedure, the two that had re- 
sponded most on the initial, shock-deletion procedure. 
The others failed to respond on the third procedure 
even after reconditioning with shock deletion. The
two that did resume responding were then run on
procedures in which responses
delayed shock but increased shock frequency. As the
number of shocks per response was increased every
five sessions, from two to three, four, and then five
shocks per response, they continued to respond on 
nearly all cycles. 

Thus, in some animals, delay of shock maintained 
responding even when it produced a fivefold increase 
in overall shock frequency. However, these animals 
had prior training on a shock-deletion procedure, and 
although their responding had subsequently been ex- 
tinguished with a noncontingent shock procedure, the 
results should be interpreted with some caution. 
Response-contingent shocks, and sometimes even non- 
contingent shocks, can maintain responding in an- 
imals with histories of negative reinforcement (e.g. 
McKearney, 1972; Powell, 1972). 

In order to be sure that prior negative reinforcement
based on combined shock-delay and shock-frequency
reduction would not contaminate the
dissociation of these two variables, the initial, shock-deletion
phase has been eliminated from subsequent
experiments with the same set of procedures with
60-second cycles. These experiments have encountered
difficulties of conditioning that have also sometimes 
been reported with more conventional negative rein- 
forcement procedures. Hence, while the following, 
rather autobiographical series will tell something 
about the contributions of shock-delay and shock-frequency
reduction to negative reinforcement, it will
also illustrate strategies that have been used to deal 
with difficulties of initial conditioning with negative 
reinforcement. This will prepare the way for a discus- 
sion of special considerations regarding initial condi- 
tioning with negative reinforcement. 

In an unpublished experiment, twelve naive rats 
were placed on the procedure diagrammed by the 
bottom line of Figure 23. The only response conse- 
quences were lever-removal and shock-delay. None of 
the twelve acquired stable lever-pressing. Hand-shap- 
ing with some animals, by making lever-removal and 
shock-delay contingent upon successively closer move- 
ments toward the lever, resulted in their hovering 
over the lever and spending most of their time near it, 
but did not produce reliable lever-pressing. The an- 
imals often tended to remain motionless during the 
10-sec opportunity period, moving around at other 
times. 

In light of these results, supplementary positive 
reinforcement was used, in a manner similar to that 
used by Riess (1970) with a conventional shock-delay 
procedure, and by Giulian and Schmaltz (1973) with 
a discrete-trial, cued shock-deletion procedure, to 
bring the animals’ behavior into contact with the 
delay contingency without a history of shock reduc- 
tion. In collaboration with G. D. Smith, several vari- 
ants of this approach were used on a few animals with 
moderate success. One is described here; it demon- 
strated the clear and reliable, but weak reinforcing 
effects of delay-of-shock with constant shock frequency. 
The sequence of procedures and results is shown in
Figure 24.

Fig. 24. Lever-presses per session for rat DS-6, on a sequence
of procedures: A) Responses produce food with no shocks
delivered; B) Responses produce food and delay of shock; C)
Same as B, but free access to food in home cage; D) Discontinue
positive reinforcement; responses produce shock delay only; E)
One shock delivered at the 16th sec of each cycle; F) One shock
delivered at the 47th sec of each cycle.

Prior to training the rats were food-deprived;
then lever-pressing was established and
maintained with delivery of sweetened condensed 
milk as reinforcer (part A of Figure 24). Then a 
slightly modified version of the shock-delay procedure 
described above (part III of Figure 23) was added, 
with shock intensities increased by increments over 
sessions (part B of Figure 24). This shock-delay pro- 
cedure differed from that diagrammed in Figure 23, in 
that: with no response the shock occurred at the 16th 
instead of the 11th sec; responding delayed the shock 
until the 47th sec; whenever it occurred, shock was 
preceded by a three-second tone. These changes were 
made to minimize response-opposing effects of signaled 
versus unsignaled shocks (e.g. Badia et al., 1973), since 
shock immediately preceded by lever removal could 
be said to be signaled shock, while the delayed shock 
could not. When shock intensities were increased to 
1.0 mA, a level that supports lever-press responding 
on conventional shock-delay procedures, food depriva- 
tion was discontinued by giving free access to lab 
chow in the home cage, but delivery of condensed 
milk contingent upon lever-presses was continued 
(part C in Figure 24). When body weights had recov- 
ered to normal free-feeding levels, the positive rein- 
forcement was discontinued (condition D in Figure 
24), and when performance was stabilized to the point 
of showing no consistent trends, two control or extinction
procedures were used to ensure that indeed it was
shock delay that maintained the responding. First,
(condition E) all shocks were delivered at the 16th
sec, independently of responding. This procedure was 
comparable to the extinction procedure of the pre- 
ceding experiment, and the results were similar; re- 
sponding dropped quickly to negligible levels. Rein- 
statement of the shock-delay procedure produced a 
slow, systematic, but variable reacquisition over some 
40 sessions. In the next control procedure (condition 
F), all the shocks were delivered noncontingently at 
the delayed position, the 47th sec of the cycle. On this 
procedure responding slowly but consistently declined 
toward zero, until reversed by a return to the shock-delay
procedure, which again produced slow, variable,
but systematic reacquisition. Thus, in an animal with
no history of response-produced shock deletion, delay 
of shock appeared to be an unconfounded but fairly 
weak reinforcer. Several animals on this procedure 
did not persist in responding when the shock intensi- 
ties were raised; others responded at lower levels when 
the positive reinforcement was discontinued. 

It is clear from the procedural diagrams of Figure 
23 that the fixed-cycle procedure separating the opportunity
to respond from the periods when shocks occur
permits complete independence of shock-delay and
shock-frequency changes made contingent upon responding.
This flexibility was used in a recent experiment
by Lambert, Bersh, Hineline, and Smith (1973),
pitting shock-frequency reduction against shock delay, 
with a different part of the range of time values per- 
mitted by a 60-sec cycle. In the absence of responding 
the lever was available for 10 sec, and then five brief 
shocks were delivered, one per second, between the 
11th and 15th sec. A response during the first 10 sec
retracted the lever and deleted the five later shocks, 
but produced one immediate shock. Thus, responses 
could reduce shock frequency at the expense of reducing
shock delay to zero. Two rats placed directly on
this procedure initially responded on approximately 
60 percent of the cycles for the first few sessions; then 
their responding dropped systematically to zero by 
session 13. Two other animals, trained with a pro- 
cedure where a response could terminate the stream 
of five shocks, also ceased responding on the procedure 
permitting only shock frequency reduction with con- 
tingent zero-delay shock. Thus, shock frequency re- 
duction also appeared to be a weak reinforcer when 
pitted against a reduction in shock-delay. 
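
The trade-off in this procedure can be written out in a few lines. The sketch is illustrative only; the function names are hypothetical, and the five shocks are placed at the 11th through 15th sec on the reading that one shock was delivered per second over that span.

```python
def shock_times(response_time=None):
    """Shock times (sec since cycle onset) within one 60-sec cycle."""
    if response_time is None:
        # No response: five brief shocks, one per second.
        return [11.0, 12.0, 13.0, 14.0, 15.0]
    if not 0.0 <= response_time <= 10.0:
        raise ValueError("the lever is available only during the first 10 sec")
    # A response deletes the five later shocks but produces one immediate shock.
    return [response_time]


def summarize(response_time=None):
    shocks = shock_times(response_time)
    delay = None if response_time is None else shocks[0] - response_time
    return {"shocks_per_cycle": len(shocks), "response_shock_delay": delay}


print(summarize())     # no response
print(summarize(3.0))  # hypothetical response at the 3rd sec
```

Responding cuts the number of shocks per cycle from five to one, but the single remaining shock follows the response with zero delay.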

Perhaps the weakness of these reinforcing effects 
should be interpreted in light of a notion advanced 
by Jenkins (1970), with reference to limited-opportunity
positive reinforcement procedures. It is a notion
very similar to the one I advanced earlier, regarding
the effects of cues on the averaging of shocks over
time:


The possibility exists that distinctively differ- 
ent occasions for nonreinforcement and rein- 
forcement confine or localize the effect of rein- 
forcement while more similar occasions extend 
the interval over which delayed reinforcement 
supports a prior response. (Jenkins, 1970, p. 105) 


To the extent that this is true, confining the response 
opportunity to a clearly discriminable situation distinct
from the effects of the response weakens the rein-
forcing effect of either short-term delay or shock- 
frequency reduction. 

The reinforcing effect may be weak for another 
reason, however. Acquisition of lever-pressing with 
conventional shock-delay procedures is most easily 
accomplished with short shock-shock intervals. Per- 
haps the change to long cycle length, attractive be- 
cause it permits longer delays, is relatively ineffective 
because it imposes a low basal shock rate. If the con- 
ventional shock-delay procedures, providing both 
shock-delay and shock-frequency reduction within the
situation where the response occurs, are ineffective for
producing initial acquisition of lever-pressing with 
similar long intervals, then the weak effects in Fig- 
ure 24, and in the work by Lambert et al. (1973) are 
not surprising. 

One strategy for studying weak reinforcing effects 
is to use a response more easily conditioned than the 
lever-press. A likely candidate is a shuttle response, 
which has been shown to be readily strengthened 
with shock reduction (Bolles, 1970; Riess, 1971; Riess 
& Farrar, 1972). In unpublished work, J. Harman and 
I have recently verified the usefulness of this approach 
by subjecting four rats to a shock-delay procedure 
with a shock-shock interval of 60 sec, and an RS in- 
terval of 120 sec. With a free-operant shuttle response, 
all animals quickly and consistently reduced their 
shock frequencies to less than one per 10 min. Lam- 
bert et al. (1973), using the shock-frequency reduction 
procedure with nondelayed shock described above, 
found a shuttle response to be maintained slightly 
more easily than lever-pressing. In an additional un- 
published experiment, J. Harman and I used a shut- 
tle response in conjunction with the shock-delay pro- 
cedure diagrammed at the bottom of Figure 23. 
Moderate to low response rates were obtained, com- 
parable to those of the latter parts of Figure 24. The 
shuttle response did not require the addition of posi- 
tive reinforcement early in training to produce these 
effects. Thus again, shock-delay and shock-frequency
had independently isolable but weak reinforcing 
effects when response opportunity was discriminably 
separated from the situation in which shocks oc- 
curred. 

Earlier in this chapter we saw that in the case of 
weak reinforcing effects of timeout from the shock- 
delay procedures, a two-response or choice situation 
provided more sensitive measures than did measures 
based on maintenance of a single response. Further, in
positive reinforcement studies of reinforcement mag- 
nitude and frequency, choice procedures provided by 
concurrent chains have also proven especially sensitive.
It may be that choice procedures would provide
an effective strategy for the isolation of shock-fre- 
quency reduction and short-term shock delay as well, 
permitting systematic assessment of the relative po- 
tency of differing magnitudes of each. 


CONSIDERATIONS REGARDING 
INITIAL ACQUISITION 


Operant conditioners have tended to emphasize 
maintenance of responding and of response patterns, 
rather than initial acquisition of control over particular
responses by reinforcement. With positive reinforcement,
conceptual concern over initial acquisition
has been allayed by appeal to the principle of shaping.
This principle, of response selection through
reinforcement of successive approximations to a spe- 
cified response, was convincingly demonstrated in 
Skinner’s early work, and is routinely replicated in 
undergraduate laboratories. Over the years the as- 
sumed generality of this principle has been occasion- 
ally challenged (e.g. Breland & Breland, 1961), but 
only recently has that challenge become a central 
topic supported by systematic work on autoshaping 
(Brown & Jenkins, 1968) and other response-selection 
effects (e.g. Staddon & Simmelhag, 1971). 

The present chapter has also emphasized mainte- 
nance rather than acquisition, partly because an in- 
tegrative account of negative reinforcement analogous 
to that of positive reinforcement seems needed at this 
time. Also, as argued by Morse and Kelleher (1966), it
appears that once control over a response with nega- 
tive reinforcement is achieved, the effects of schedul- 
ing negative reinforcement are much like those of 
scheduling positive reinforcement. However, initial 
acquisition with negative reinforcement has consis- 
tently been of more concern than with positive, 
probably because using the former, experimenters 
have often encountered difficulty in producing a chosen 
response. The difficulties have tended to arise in relation
to particular species, and to particular responses.


Species Differences in Acquisition 


According to informal laboratory lore, “monkeys
and dogs avoid better than rats do,” and until very 
recent years the behavior of pigeons seemed virtually 
immune to negative reinforcement based on shock 
reduction. However, it is impossible to deduce exactly 
comparable situations for different species and choose 
apparatus configurations, shock intensities, and cue 
intensities or modalities on the basis of the subjects’ 
differing shapes and receptors. Hence, it is difficult to 
devise valid cross-species comparisons. The case of 
pigeons is especially instructive in this regard. 

In view of several unpublished, unsuccessful at- 
tempts to produce or maintain key-pecking with 
shock-delay or shock-reduction, Rachlin and Hineline 
(1967) devised a shaping procedure based on termina- 
tion of trains of shock pulses with slowly increasing 
intensity. They successfully, if arduously, shaped the 
key-peck response and maintained repetitive respond- 
ing within a limited range of fixed-ratio and fixed- 
interval schedules (Hineline & Rachlin, 1969a, b). More 
recently, Ferrari, Todorov, and Graeff (1973) developed
these shaping procedures further and maintained
stable key-peck responding with a shock-delay procedure.
Using an SS interval of two seconds, manipulations
of the RS interval resulted in performance
changes similar to those observed by Sidman with rats 
(Sidman, 1953). However, they too reported the need 
for hours of tedious shaping during the initial train- 
ing. Schwartz and Coulter (1971) found that key- 
pecking established with positive reinforcement did 
not readily transfer to negative reinforcement. How- 
ever, more recently, Lewis, Lewin, Stoyak, and Mueh- 
leisen (1974) were able to transfer control of key- 
pecking from positive reinforcement to negative 
reinforcement based on termination of a cue and 
deletion of frequent irregularly spaced shocks, and 
Foree and LoLordo (1974) succeeded in a similar 
transfer, to a shock-delay procedure with added cues. 

It is clear, however, that with the pigeon, responses 
other than key-pecking are more easily established 
with negative reinforcement. Hoffman and Fleshler
(1959) found a head-lifting response to be slightly 
more susceptible to shock-reduction, and MacPhail 
(1968) obtained reliable, high-probability shuttle responding
with pigeons on a cued shock-deletion procedure.
Later, Smith and Keller (1970) and Foree and
LoLordo (1970) easily accomplished conditioning with 
free-operant shock-delay procedures contingent upon 
a foot-treadle response. Klein and Rilling (1972) have 
built upon this by manipulating the parameters of 
shock-delay and of shock intensity, replicating in considerable
detail the results obtained with the lever-press
in rats (Boren, Sidman, & Herrnstein, 1959;
Sidman, 1953). As described above, Dinsmoor and
Sears (1973) used the same schedule and response to 
demonstrate dimensional control by response-produced 
stimuli, and Klein and Rilling (1974) also used the 
treadle-press in pigeons to study generalization effects 
with tones accompanying the shock-delay component 
of a multiple schedule.

This brief history shows that some aspects of ob- 
served species differences in behavioral processes origi- 
nate as much in the method used for examining them, 
as in any differing capacities of different species. In
addition, it points up the importance of response spe- 
cification for ease of conditioning, a consideration 
that has had considerable impact on recent interpreta- 
tions of negative reinforcement. 


Response Differences: Apparent Susceptibility to Reinforcement


I just reviewed evidence that in pigeons treadle- 
press or shuttle responses have been more easily 
brought under control of negative reinforcement than 
has the key-peck, which is so readily controlled by
positive reinforcement. Earlier, I noted comparable
observations by Riess and Farrar (1972) that in rats a 
shuttling response is usually acquired with negative 
reinforcement more rapidly and reliably than is a re- 
petitive lever-press response. Further along similar 
lines, Keehn (1967) observed that the response of 
running in a wheel was acquired more quickly with a 
shock-delay contingency than was a lever-press re- 
sponse. Bolles (1970) found differences in acquisition 
on a shock-deletion procedure, when comparing the 
responses of running in a wheel, turning, or rearing. 

Bolles (1970) has discussed these differences in relation
to “species-specific defense reactions.” He noted
that in the genetic histories of small animals, random 
“trial-and-error” patterns of behavior must have been 
maladaptive in aversive situations or situations ac- 
companied by intense or novel stimulation. While the 
first inappropriate response producing loss of food
leaves an animal still hungry, an inappropriate response
in an aversive situation may very well leave
the animal dead. Bolles proposed that in such situations
the responses of freezing, fighting, or fleeing
would be most generally adaptive, so animals that
tended to behave in these ways in these situations
would tend to survive and reproduce. In their offspring,
these behavior patterns could be expected to
predominate in aversive and novel situations. This 
line of reasoning can lead to two interpretations that
are not mutually exclusive. One proposes that particular
responses are especially associable or conditionable
with particular stimuli. The other analyzes
the differences in terms of the operant levels of various
responses in aversive situations. Bolles has clearly
implied that the former is more important, arguing
that negative reinforcement is basically selection from
among species-specific defense reactions: other responses
are deemed unconditionable. From this viewpoint,
the conditioning of a particular response with
negative reinforcement can only be achieved to the
extent that the response is part of a species-specific
defense reaction. Seligman and Hager (1972), developing
an earlier discussion by Seligman (1970), have
opted for a similar interpretation. They used a concept
of “preparedness” to make explicit the proposed
susceptibility of a response to the effects of (or in their 
terminology, the associability of a response to) a par- 
ticular reinforcer. 

Both of these interpretations relate well to the ap- 
parent differences of conditionability described above. 
They also account for Bindra and Anchel’s (1963) 
demonstration that immobility can readily be condi- 
tioned with a shock-removal contingency, and the 
observation by Keehn (1967) that lever-holding is 
more readily produced and maintained with negative
reinforcement than is repeated lever-pressing. Both
immobility and lever-holding can be components of
the species-specific response of freezing. The first interpretation
outlined above is especially supported by
an experiment by Foree and LoLordo (1973). They
found that after a combined light and tone were used 
as a cue in an appetitive situation the light had 
stronger effects on behavior, whereas after the same 
light and tone were used as a cue in a negative rein- 
forcement procedure the tone had stronger effects on 
behavior. 

However, attributing species differences in conditioning
to species differences in associativity or susceptibility
to reinforcement does have some drawbacks.
First, as Schwartz (1974) has pointed out, any inde- 
pendent assessment of such “preparedness” of a re- 
sponse must be made with existing conditioning 
procedures. Different procedures will give differing re- 
sults, and any attempt to standardize the measures of 
“preparedness” will necessarily standardize a set of 
“fundamental” conditioning procedures. This would 
be counter to current critical examinations of com- 
mon conditioning concepts, and would impose an 
arbitrariness more difficult to assess than that of the 
“arbitrary response.” 

Second, the “species-specific” label draws our atten- 
tion away from the fact that these reactions and their 
conditionability, however common and stereotyped 
among members of a species, have ontogenetic as well 
as genetic origins. I wonder, for example, about the
degree to which the conditionability of a rat’s run- 
ning in an aversive situation originated in its interac- 
tions with littermates during the early weeks of life. 
The label “species-typical” would be more appropriate.
This term would remind us that we have not
closed the question of the origins of an animal’s be- 
havioral characteristics, while still suggesting that we 
look to genetic factors for part of the answer. 

Third, in appealing to genetic determination one 
appeals to a largely unspecified sequence of events. To
be sure, one can breed animals that tend to respond 
in this way or that, but this only establishes the 
plausibility of the explanation. To the extent that 
species-typical reactions are specified and studied inde- 
pendently, with adequate identification of the circum- 
stances under which they will occur, their use is easily 
justified. However, the genetic explanations tend to be
used in a post hoc fashion when all else has failed. 
The independent specification usually is not available, 
and one can thus choose convenient properties for the 
genetically determined behavior. More directly testable
explanations are sometimes available in which
the relative ease of conditioning of a particular response
is explained in terms of attributes of the
response itself. For example, Keehn (1967) noted that
the relatively slow acquisition of lever-press respond- 
ing, compared to wheel running, may occur because 
in the latter a continued response pattern is permit- 
ted, while the former requires repeated production 
and discontinuance of a response pattern. He com- 
pared acquisition of running with acquisition of lever- 
holding, and found these two responses equally con- 
ditionable with negative reinforcement. 

Finally, as Ferrari, Todorov, and Graeff (1973) have 
pointed out, any comparison of conditionability or 
associability of different responses must be made on 
the basis of optimal procedures for each response. 
This brings us inevitably to the other aspect of Bolles’s 
formulation, the role of shock-produced behavior in 
determining the operant levels of different responses. 
Optimal procedures must accommodate or eliminate 
this behavior. 


COMPETING, SHOCK-PRODUCED BEHAVIOR 


Any intense, novel, or aversive stimulus is likely to 
produce behavior that helps to determine the reper- 
toire of responses available for reinforcement. This is 
especially relevant in negative reinforcement procedures
since, by definition, such stimulation must be
present, highly probable, or impending when the to- 
be-reinforced response is emitted. This contrasts with 
positive reinforcement situations, where the critical 
stimulus is presented after the response is emitted. 
Effects of noncontingent aversive stimulation have 
been reviewed by Myer (1971), who abstracted some 
useful principles. He noted, for example, that prox- 
imally received stimuli usually produce skeletal be- 
havior that tends to remove the stimulation. Distally 
received stimuli—in themselves aversive or paired with 
aversive stimuli—tend to produce cessation of move- 
ment. 

Thus electric shock, while providing the basis for 
reinforcement, produces behavior patterns that must 
interact with the responses that experimenters have 
chosen to reinforce. Sometimes facilitation occurs, but 
often competition or disruption occurs instead. Specific 
competing patterns have been described and discussed
by experimenters who had difficulty in conditioning
particular responses with negative reinforcement.
For example, a number of experimenters have
observed shock-produced freezing and lever-holding
(e.g. Dinsmoor, 1967; Feldman & Bremner, 1963; 
Keehn, 1967). Smith, Gustavson, and Gregor (1972) 
used high-speed photography to examine the pigeon’s 
response to unsignaled shock, and found that the
shock produced head-flexions, movements in the direction
opposite to that required for key-pecking.
Shock-elicited aggressive patterns such as biting the 
manipulandum have also been observed and measured 
(Azrin, Hutchinson and Hake, 1967; Azrin, Rubin 
and Hutchinson, 1968; Pear, Moody and Persinger, 
1972; Powell, 1972). The use of intermittent shocks or 
cues correlated with these shocks may not eliminate 
competing behavior, for as Hoffman (1966) and Myer 
(1971) have noted, the preshock warning stimulus 
commonly used in shock-deletion procedures fits a 
paradigm for conditioned suppression of active skele- 
tal behavior. As Estes and Skinner (1941) demon- 
strated, appetitively maintained behavior is suppressed 
in the presence of a stimulus that has been paired 
with shock. While after initial conditioning has been 
achieved with negative reinforcement, such a stimulus 
may enhance active responding (e.g. Rescorla, 1968), 
its effect during initial conditioning is to produce 
crouching and freezing. 

Each of these shock-produced behavior patterns 
has been seen as interfering with initial attempts to 
bring lever-press responding under control of negative
reinforcement. I have also found evidence that in
rats the disruption persists beyond initial condition- 
ing (Hineline, 1966). The persisting disruption is 
revealed in a commonly-observed “warm-up” effect, 
whereby even after they have achieved proficient per- 
formance, rats continue to take many shocks early in 
sessions. In this experiment rats were conditioned 
with a shock-delay procedure (SS = RS = 20 sec). Direct
observation of subjects showing pronounced warm-up
revealed that patterns of freezing and lever-holding
were clearly evident early in the sessions, when most
shocks were received, but not later in sessions. In a 
subsequent set of procedures, the possible disruptive 
effects were examined indirectly, by making positive 
reinforcement and shock-delay both contingent upon 
presses of the shock-delay lever. The schedule of positive
reinforcement (VI 40 sec), when used alone prior
to shock-delay training, produced performances with
few interresponse times exceeding 20 sec until satiation
late in the sessions. Thus, if there were no disruption,
virtually all early-session shocks should have been
eliminated by the addition of positive reinforcement. 
For rats with extensive prior training the added posi- 
tive reinforcement failed to eliminate the dispropor- 
tionately high shock frequencies early in sessions. 
Animals with less extensive training showed some 
reduction of warm-up with the added positive rein- 
forcement. However, four of five animals continued 
to show at least some warm-up, indicating transient 
disruption of lever-pressing. To the extent that these
results implicate competing behavior as disrupting
negatively reinforced responding early in all sessions, 
competing responses are relevant to maintenance 
of the conditioned response, as well as to its initial 
acquisition. Experimenters have often tacitly acknowl- 
edged this by discarding data from the initial parts of 
all sessions, even in studies of long-term maintenance. 
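
The logic of the added-reinforcement test can be made concrete with a small simulation. What follows is only a sketch of a free-operant shock-delay schedule as conventionally described (shocks recur every SS sec in the absence of responding, and each response postpones the next shock to RS sec after that response); it is not a reconstruction of the apparatus programming used in the experiment. The function name `shock_times`, the assumption that the first SS interval starts at session onset, and the example response times are all hypothetical.

```python
def shock_times(response_times, ss=20.0, rs=20.0, session_length=600.0):
    """Return the times (sec) at which shocks are delivered.

    Assumed rules (illustrative, not taken from the original report):
      * with no responding, shocks occur every `ss` sec, the first one
        `ss` sec after session onset;
      * every response resets the clock, so the next shock is due
        `rs` sec after the most recent response.
    """
    responses = sorted(t for t in response_times if 0 <= t < session_length)
    shocks = []
    next_shock = ss
    i = 0
    while True:
        # Every response that precedes the pending shock postpones it.
        while i < len(responses) and responses[i] < next_shock:
            next_shock = responses[i] + rs
            i += 1
        if next_shock >= session_length:
            break
        shocks.append(next_shock)
        next_shock += ss  # shock-shock interval when no response intervenes
    return shocks


# A steady responder whose interresponse times never exceed 20 sec.
steady = list(range(10, 600, 15))
# A hypothetical "warm-up" session: no responding for the first minute.
warm_up = [t for t in steady if t >= 65]

print(len(shock_times(steady)))   # shocks taken by the steady responder
print(shock_times(warm_up))       # shocks taken during the silent first minute
```

Under these assumptions, a response pattern with no interresponse time above 20 sec takes no shocks at all, which is why shocks that persist early in sessions despite the added positive reinforcement point to some disruption of lever-pressing.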


NEGATIVE REINFORCEMENT AND THE 
ONGOING BEHAVIORAL STREAM 


Whichever aspect of the shock-produced competing 
responses one chooses to stress, one must come to grips 
with their ontogenetic as well as their phylogenetic 
origin. Furthermore, since they affect the operant 
level, or “behavioral stream” (Schoenfeld, 1969) upon 
which negative reinforcement must operate, one must
consider specifically how they are maintained in a
given situation. 

Of course, operant conditioning principles partly 
describe the shaping of the behavioral stream. These
principles can be used to explicitly study the main- 
tenance of the “competing behavior.” For example, 
Keehn and Walsh (1970) have examined reinforcement
of lever-holding, and Bolles and Riley (1973)
have carefully examined reinforcement and punishment,
as well as elicitation of a freezing response. In
some cases operant principles have been used to specifically
eliminate the competing behavior. For example,
Forgione (1970) improved the acquisition of
repeated lever-pressing by identifying and eliminating 
reinforcement that inadvertently had been made con- 
tingent upon short-latency post-shock responding. 
Feldman and Bremner (1963) eliminated freezing and 
lever-holding by making brief shock contingent upon 
these responses, while making shock-delay contingent 
upon repeated pressing.

Initial training procedures for bringing a particular
response under control of negative reinforcement
are “little transfer experiments.” The result of one 
conditioning procedure produces the operant level for 
the next procedure to act upon. Therefore it is not 
surprising that the effects of negative reinforcement 
are very different once an organism’s behavior has 
been brought under control of some negative rein- 
forcement procedure. The situation is analogous to 
the teaching of swimming. Once one swimming tech- 
nique is brought under control of water-filled situa- 
tions, which involves eliminating many incompatible 
water-produced responses, it is relatively easy to train 
any of a variety of swimming patterns. 

However, some components of the behavioral stream 
clearly are produced in ways not subsumed under
principles of reinforcement and punishment. Usually
the species-typical patterns have been characterized as 
elicited even though they may include complex re- 
sponse patterns. Elicitation plays a role, as Bolles and 
Riley (1973) have demonstrated, but it is somewhat 
inadequate in that it implies a passive organism that 
does nothing except when goaded. The ethologists’ 
concept of “releasing stimuli” for particular classes of 
behavior is only slightly less limiting. It more readily 
encompasses complex behavior patterns, but still 
tends to characterize the controlling environment as 
made up of discrete, single stimuli.

To be sure, single stimuli are important, but an- 
imals are also sensitive to temporal configurations of 
environmental objects and events. Temporal patterns 
of environmental events interact with patterns of ongoing
behavior. Perhaps we shall come to deal with
these interactions in terms of modulation (Gibbon & 
O'Connell, 1975; Morse & Kelleher, 1970), periodicity 
and aperiodicity (Kadden, 1973), autocorrelation,
closed and open loops, and/or contingency and non- 
contingency as expressed in a statistical, rather than a 
merely associative sense (Baum, 1973). Concepts such
as these may help us understand how some procedures 
that were designed to reinforce a specific response are 
actually more effective for producing behavior that 
competes with that response. It is not yet clear whether 
these developing concepts will come to subsume what 
we now call reinforced as well as what we call non- 
reinforced behavior. Perhaps the operations we now 
treat as discrete and unitary will be of interest mainly 
as ends of continua such as those suggested by Catania 
(1971). Whatever reformulations occur, they must also 
deal with interrelations described here, between re- 
sponding and the response-contingent reduction of 
stimulation.


By-Products of Aversive


INTRODUCTION 


The concept of aversive control is familiar to all 
psychologists who work with operant methods, and 
commonly refers to the conditioning procedures of 
escape, avoidance, and punishment. Under such conditions,
it has often been found that secondary or “by-product”
performances are generated. This chapter
discusses some of the secondary effects of aversive con- 
trol and describes how these may be functionally 
related to aversive control procedures. 

The concept of aversive control connotes the application
of aversive stimuli in a manner such that its
consequences affect performance. Aversiveness is assessed
by the capacity of a stimulus to support responses
which eliminate or reduce such stimulation, or alternatively
by its capacity to suppress performances
maintained by other stimuli. Thus, aversive stimuli
are often referred to as negative reinforcers or punishers.
It is in such a context that the concept of by-products
of aversive control has developed. When a
stimulus is thought of as having properties defined by
its capacity to alter behavior through a contingent
relation with that behavior, simultaneous influences
upon behavioral processes other than by contingent
influence have typically been discussed as by-product,
secondary, or adjunctive influences of the stimulus.
Yet attention to the contingent control of certain environmental
stimuli is not a sufficient reason to consider
other performances or effects which are produced
simultaneously and non-contingently by such stimuli
to be secondary in nature. Only recently has it become
understood that aversive stimuli may produce com- 
plex chains of reactions directly, and that though 
these performances may be seen in environments 
where the contingent control of some behavior is be- 
ing studied, such direct effects are also present in 
environments where no contingency arrangements are 
present. 

Recent work shows that aversive stimulation produces
complex, highly coordinated performance sequences
in a wide range of species. These performances
are reliably produced and constant during extended
periods of observation. Though such behaviors may 
be modified in learning paradigms, their expression 
is fundamentally dependent neither upon historical 
nor contemporary response-contingent environmental 
events. Recent findings in our laboratory suggest that 
such reactions produced by aversive stimuli may par- 
tially form the basis for the complex performances 
which often result from the application of response- 
contingent procedures. 


METHODS 


The methods employed for the work described in
the present chapter are, in the main, described in
detail in earlier publications. It may be useful, however,
to describe several experimental guidelines
which have emerged over a number of different studies
designed to measure behavioral sequences not usually
undergoing response-contingent control.

By the very nature of aversive stimulation, vigorous
attempts by subjects to escape or avoid full or direct
stimulus contact are likely. To prevent unwarranted
stimulus variability, apparatus must be designed to
insure stable long-term contact as experimentally specified.
Stimuli should be brief to prevent momentary
variations in posturing and orientation which can
produce marked alterations in stimulus intensity.
Additionally, continuous feedback to the experimenter
regarding contact dimensions such as skin resistance,
electrode pressures, etc., must be routinely
available on a moment-to-moment basis.

Response-sensor mechanisms must be located ap- 
propriately within the experimental space so as to 
make optimal contact with performances as they oc- 
cur. No control via reinforcement of successive re- 
sponse approximations (‘‘shaping’’) or contingency 
management is possible in such experiments. The ex- 
perimenter records the behavior where he finds it. 
Constant attempts at improving the suitability of con- 
tact surfaces and feedback stimuli are required, as are 
sensors which are indestructible and operate reliably 
over hundreds of thousands of occurrences. It is also 
necessary to tailor chamber spaces to suit an individual
subject’s physical and behavioral variations.

Figure 1 illustrates an apparatus which we have
designed for testing biting attack responses in mice.
The mouse here shown while biting a small protrud- 
ing nylon object is restrained in a small plexiglas 
tube. The tube is removable from the apparatus so 
that most subject preparation can be accomplished in 
the home colony. Subsequent to placement in the test- 
ing apparatus, the subject’s tail is cleansed and placed 
under two contact electrodes at the rear of the apparatus.
In front of the subject is an object (in most
cases a small bit of flexible nylon) attached to a variable
force and displacement sensor. A standard telegraph
key has proved useful for this purpose. The
method has been suitable for the study of effects of 
genetic variations, drug influences, and social living 
conditions on attack behavior.

Fig. 1. Photograph of apparatus used for testing the effects of
aversive stimulation on biting attack by mice. The subject
shown here biting a protruding nylon object is partially restrained
in a cylindrical capsule mounted at the center of
the apparatus. The subject’s tail is lightly taped to a plexiglas
rod. Two brass electrodes rest upon cleansed and prepared
portions of the tail. The nylon bite object is attached directly
to a telegraph key which allows force and displacement adjustments.

Fig. 2. Apparatus used for testing effects of aversive stimulation
upon manual manipulative, biting attack, and drinking reactions
in the squirrel monkey. The subject is restrained by a
waist-lock assembly and is seated on two plexiglas rods. The
tail is placed in a stockade device. Two brass electrodes rest
upon cleansed and prepared portions of the distal section of
the tail. A rubber hose may be connected between two pipe
stanchions located external to the right and left walls of the
test space. Compressive forces possible only by biting attack
are recorded via an air flow switch mounted at the rear of the
intelligence panel. A response lever may be socket mounted on
the lower left hand quadrant of the removable front intelligence
panel. In studies where chain pulling was measured, the chain
is suspended from the chamber ceiling.


Figure 2 is an illustration of the apparatus used for the study of squirrel
monkey subjects. The chamber was originally developed
by Drs. D. F. Hake and N. H. Azrin at Anna
State Hospital for work on shock-avoidance behavior
(Hake & Azrin, 1963; Hake, 1968) and incorporates
a number of helpful features for the study of elicited
behavior. Gross physical movement of the subject is 
controlled such that an aversive stimulus can be ap- 
plied precisely through electrodes upon a shaved por- 
tion of the tail which is restrained under a stockade. 
The upper torso and limbs are left unrestrained so 
that a considerable range of behavior is possible. In 
studies of biting attack, a rubber hose is suspended 
several inches in front of the subject approximately at 
head level. Bites on the rubber hose produce sufficient 
air displacement to trigger an air flow switch mounted 
on the rear of the panel, but grasping, tugging, and 
pulling have too little effect to trip the switch. A 
drink tube and drinkometer to record drinking can be
mounted on the panel either in front of, or to the left
of the subject. Response levers, chains, etc., can be 
mounted suitably for manual response contact. 

The human testing paradigms have been used to 
study the phyletic continuity process of aggression and 
anger. These paradigms in earlier tests with other 
species revealed the effects of social contingencies
and symbolic communication upon operant and respondent
processes, and the effect of several drugs upon
aggressive reactions.

Studies with humans have required preliminary de- 
velopment of force transducers inserted directly into 
the mouth. These devices have allowed perfection of 
the methods of recording biting contractions externally 
and without awareness by the subjects. Electrodes are 
placed over the temporalis and masseter muscles at 
positions illustrated in Figure 3. The muscles control- 
ling the eccentric and concentric occlusal patterns 
contract, i.e., biting occurs, during both the presentation
of aversive stimuli and subsequent to the withdrawal
of positive reinforcers (Hutchinson & Pierce,
1971; Pierce, 1971; Proni, 1973). Small needle electrodes
are used rather than surface electrodes, as these
allow far more precise, noise-free recordings of electromyographic
(EMG) potentials which covary with
bite contraction force (Hutchinson & Pierce, 1971).


HUMAN BITE ELECTRODE PLACEMENTS (figure; legend: ○ EMG, ● INDIFFERENT)

Fig. 3. Schematic illustration of electrode placement positions 
for recording of temporalis and masseter electromyographic 
activity. The indifferent electrodes are paralleled electrically to 
provide a balanced reference. The electrode on the nose is a 
silver disc. Standard clip electrodes are used on each ear. All 
other electrodes are standard 14” subdermal EEG needle electrodes.

Fig. 4. Photograph of the actual test space for assessing biting
and other motor responses by human subjects subsequent to 
aversive stimulation. Electrode leads from head recording areas 
are provided strain relief under a standard athletic bandage 
wrapped loosely around the forehead and returned to an input 
box at the rear of the chair. Other recording electrodes are
evident on the forearm and wrist. On the intelligence panel 
of the console are signal lights and response buttons. At center 
left is an intercom for continuous communications with the 
experimenter or other subjects during social procedures. A 
magazine cup for delivery of coins is also provided. The test
chamber is linked visually via a window arrangement with an 
identical test space. The two spaces may be linked or separated.


EMG recordings from the forearm, neck, and other 
areas allow confirmation of differential activity of the 
temporalis and masseter muscles relative to these lat- 
ter muscle groups at different times. An actual test 
setting of a human subject is shown in Figure 4. After 
preparation for an experimental session a subject is 
seated in an electrostatically and acoustically isolated
test space where various stimuli and responses may be 
recorded. Test spaces are arranged in pairs to allow,
when experimentally desirable, visual and auditory 
contact with partners in social experiments. 


BEHAVIOR CAUSED BY AVERSIVE 
STIMULATION 


This section will discuss the present state of our 
understanding of several major classes of behavioral 
sequences as they relate to the occurrence of aversive 
stimulation and to one another. Historically, our laboratory
has worked to develop methods and techniques
for the objective long-term study of aggression-attack
sequences in animals and man. The departure
point for this work is the series of studies by Ulrich, 
Azrin, and their colleagues (Azrin, Hake, & Hutchinson,
1965; Azrin, Hutchinson, & Hake, 1963; Azrin,
Hutchinson, & Sallery, 1964; Hutchinson, Azrin, &
Hake, 1966; Ulrich & Azrin, 1962; Ulrich, Hutchinson,
& Azrin, 1965).

As experience has been gained in the sensing of 
complex reactions in several species, it has become 
apparent that the aggression-attack reaction to aver- 
sive stimulation is only one of several identifiable 
behavioral sequences and patterns which relate to one 
another and to aversive stimulation in a reliable 
fashion. 

Numerous studies have shown that the delivery of 
an intense aversive, noxious, or unpleasant stimulus 
will produce, in a variety of species, movement to- 
ward, contact with, and possibly destruction of, an- 
imate or inanimate objects in the environment (Ulrich 
& Azrin, 1962; Ulrich, Hutchinson, & Azrin, 1965). 
Figure 5 illustrates some temporal and intensive rela- 
tions which have been recorded from several species 
following delivery of a noxious or aversive stimulus. 
Shortly after the stimulus event, attack or biting be- 
gins at a high intensity. Repetitions of this response 
are likely with the frequency and intensity gradually 
falling over a period of seconds or minutes (Azrin, 
Hutchinson, & Sallery, 1964; Hutchinson, Azrin, & 
Renfrew, 1968). 

If aversive stimuli are repeatedly delivered in a 
discriminable temporal pattern a display of aggression- 
attack reactions assumes additional features. During 
the period prior to an ensuing aversive stimulus (and 
at a discriminable temporal period after aversive stim- 
uli) additional biting attack reactions will occur. Fig- 
ure 6 illustrates this temporal pattern for one species 
and subject. Biting attack reactions occur for a period 
before the occurrence of an aversive stimulus, but 
usually cease just before shock (Hutchinson & Emley, 
1972; Hutchinson, Renfrew, & Young, 1971). 

Although aggression-attack sequences often occur 
in reaction to conditional stimuli, more recent studies 
have shown that other reaction sequences are also 
likely. These reactions include sensory scanning, man- 
ual manipulation, and locomotion sequences (Hutch- 
inson & Emley, 1972; Hutchinson, Renfrew, & Young, 
1971). Figure 7 illustrates the automatic recording of 
shock-induced lever pressing in a squirrel monkey and 
noise-induced movement by a human. During the 
presentation of a conditional stimulus, activities oc- 
cur at a progressively increasing rate until just before 
the aversive stimulus, when all reactions cease. 

Sensory scanning, locomotion, and manual manipulative
behaviors are prepotent to attack reactions during the
period prior to unconditional stimulus occurrence
(Hutchinson & Emley, 1972; Hutchinson & Emley, 1973).
Figure 8 shows the records of a monkey and a human male
when the separate reaction classes could each occur. Lever
presses of the monkey, and movements of the torso and upper
limbs of the male human, progressively increase in
frequency during a period of conditional stimulus delivery,
but immediately prior to unconditional stimulus delivery
these reactions cease. Following delivery of the
unconditional stimulus, biting reactions occur.

Fig. 5. Automatic recordings of biting responses by a mouse, a
squirrel monkey, and an adult male human subsequent to
aversive stimulation. The mouse subject was of the
Swiss-Webster strain. Shock was via tail electrodes, at 600 v
for 200 msec. Biting attack upon a nylon object began shortly
thereafter. The nylon bite object was connected to a Statham
force transducer, the output of which was amplified and
recorded by a Grass Model 5 polygraph. Static peak force
required to produce excursions as seen here is approximately
35 g. The squirrel monkey was shocked for 200 msec via tail
electrodes. The pressure tracing of biting was obtained from a
Statham P23 pressure transducer, the output of which was
amplified and displayed by a Grass Model 5 polygraph. The
human subject was stimulated with 110 db of 2000-Hz tone.
Speakers were mounted on either side, and several feet to the
front of the subject. Biting, in the form of nonfunctional,
concentric occlusion, began shortly thereafter. Responses were
recorded with subdermal EEG electrodes. Signals were
amplified, integrated, and recorded by a Grass Model 5
polygraph.

Fig. 6. Reconstruction of actual event records obtained from
one squirrel monkey subject of biting attack responses both
prior and subsequent to aversive stimulation. Shock was 400 v,
delivered for 200 msec each four min.

Fig. 7. Reconstruction of actual event records obtained from a
squirrel monkey subject and an adult male human during
periods of aversive stimulation. For the squirrel monkey,
electric shock was 400 v for 200 msec delivered every 4 min.
The response lever was mounted on the intelligence panel
immediately ahead of and to the left of the subject. For the
human, aversive stimulation was 2 seconds of 2000-Hz pure tone
delivered each 3 minutes at an intensity of 110 db (measured
at the amplifier output transformer). Aversive stimulation is
preceded by progressively increasing and then decreasing
frequencies of manual responding and/or the physical
movements of upper torso or arms.

Fig. 8. Reconstruction of actual event records obtained for a
squirrel monkey subject and an adult male human. For each
subject simultaneous records of pre-aversive stimulus manual
manipulative or movement reactions and post-aversive stimulus
biting were obtained. Parameters of aversive stimulation were
as described previously.

The addition of apparatus suitable for sensing a 
second reaction of different topography is not only a 
technical exercise. Such a change alters the environ- 
ment and may influence the relative and combined 
behavioral expressions which occur. An example of 
the influence of a change in features of the environ- 
ment (established at the convenience of the investiga- 
tor to allow simultaneous sensing and recording of 
several responses) upon the several reactions which 
may occur is illustrated by the upper graph in Figure 
9. In Figure 9, the aggression-attack reactions of biting
a rubber hose, and the manual-manipulative and
sensory-scanning reactions involved in pulling a chain,
are influenced by the presence or absence of the
opportunity to bite (Hutchinson & Emley, 1972; Hutchinson,
1970). The pair of cumulative records obtained on Day 5
for a squirrel monkey subject illustrates that
chain-pulling responses occurred predominantly
before the deliveries of electric shocks. Immediately
before shock delivery, chain pulling was absent. Sub- 
sequent to shock delivery, biting attack occurred and 
then progressively diminished. Later, for a series of 
tests, the rubber hose was removed from the chamber 
for the entire session. Both post-shock and pre-shock 
chain pulling increased. These effects can be noted on 
Day 22 and Day 34. After replacement of the hose, 
performance returned to the earlier pattern. An
exaggerated instance of this dual interaction between the
pre-event and post-event behaviors is shown in the lower
graph of Figure 9. Here, for a squirrel monkey subject,
the interactions between hose biting and lever pressing
were tested.
Studies of such interactions are important in under- 
standing the separate processes for two reasons. On 
the one hand, the elimination of the opportunity to 
engage in biting attack reactions subsequent to the 
delivery of the aversive stimulus generated greater 
numbers of anticipatory or pre-shock manual manipu- 
lative reactions—an effect similar to only one other 
condition known to us—that produced by an increase 
in shock intensity or duration. Thus, preventing at- 
tack subsequent to aversive stimulation is function- 
ally similar to increasing the intensity of an aversive 
stimulus. This suggests the possibility that the influ- 
ences of a biting attack are similar to those of shock 
reduction (Hutchinson, Renfrew, & Young, 1971). At 
a more speculative level, this effect may account for 
at least a portion of the reinforcement inherent in bit- 
ing attack sequences which has been reported in 
previous studies (Azrin, Hutchinson, & McLaughlin, 
1965). 

Of greater relevance to the present discussion is 
the second effect noted: removal of the opportunity to 
attack caused an increase in post-shock manual manip- 
ulative and sensory scanning responses of chain pull- 
ing and lever pressing. We have repeatedly found that 
post-shock manual responses have increased when the 
opportunity to attack was absent, to a level almost 
identical in absolute number with the frequency of 
post-shock attack responses which occurred during 
periods when a hose was present (Hutchinson, 1970; 
Hutchinson & Emley, 1972). Conversely, the provision 
of opportunity to attack causes a shift from post-shock 
manual manipulative and sensory scanning responses 
to biting attack. When allowed, attack reactions are 
prepotent over other locomotor and manipulative re- 
actions subsequent to aversive stimulation: this is the 
Fig. 9 (upper). Cumulative response records for biting attack
and chain pulling by one squirrel monkey subject before,
during, and after removal of the opportunity to engage in
biting attack responses. During the period illustrated on
Days 22 and 34, while the bite hose was removed from the test
chamber, the subject demonstrated increased pre-shock
chain-pulling response bursts equivalent to response bursting
of biting attacks during previous periods of hose
availability. Subsequent to return of the opportunity to
attack, performance returned to original levels. (lower).
Cumulative response record segment showing magnified
illustration of the increase of pre-shock lever presses and
the instatement of post-shock lever-press burst reactions
subsequent to the removal of the opportunity to attack. The
inter-shock interval was 4 min.


suggestion presented, but it does not take into account 
several features of the testing apparatus and methods 
employed in the studies. First, the testing environ- 
ments all include features of physical or social re- 
straint. For infrahuman subjects, cage arrangements 
have stockades or other restraint devices which guaran- 
tee contact between an applied aversive stimulus and 
the organism. Additionally, the stimulus is delivered 
for only a brief instant to minimize the possibility 
that any movements or other efforts by the subject
might be followed by a reduction or termination of 
the stimulus and thus be reinforced. With the human 
testing techniques, actual reliable contact between an 
aversive stimulus and the subject is provided by the 
recruitment of volunteers who are fully informed of 
the noxiousness which may occur. In addition the 
volunteers are assured that no unreasonable or poten- 
tially hazardous stimulation will be used, that finan- 
cial remuneration will be received, that only mild 
noxious stimuli will be used, and that escape is always 
immediately available. In summary, for both animals 
and humans, the testing paradigms have been de- 
signed to eliminate a class of reactions which is known 
to be highly probable during or immediately subse- 
quent to the occurrence of an aversive stimulus, e.g., 
physical movement and escape from the noxious stimulus.
Innate reactions of flinching, jumping, and running
immediately after application of an aversive stimulus are
thoroughly documented in the literature (Brogden, Lipman,
& Culler, 1938; Campbell & Teghtsoonian, 1958; Liddell,
1934). Also, in more recent studies, it has been shown
that an experimentally learned escape reaction will become
prepotent over aggression-attack sequences (Azrin,
Hutchinson, & Hake, 1967; Ulrich, 1967).

Are there reactions which are equally or more potent
than attack sequences in environments where no
reinforcement for such non-attack behavior occurs? In fact
the temporal primacy of locomotor and manual manipulative
reactions over aggression-attack behaviors has been
observed. A minority of subjects in our laboratory continue
to make a small number of manual responses for long periods
subsequent to shock delivery and before attack, even when
attack opportunity is present. Figure 10 is an illustration
of the high speed event records obtained for both lever
presses and biting attacks. Whereas lever presses dominate
before shock delivery, the typical pattern of biting
attacks predominates after shock delivery. The important
feature of this illustration, however, is the temporal
position of the manual behaviors which do occur following
shock. When such manual responses occur, they predominantly
occur immediately after shock and before the biting attack
responses. Thus, even in environments where there has been
no reinforcement for long periods of time, a number of
subjects continue to engage in locomotion, manual
manipulation, and sensory scanning reactions immediately
after shock and before the aggressive-attack reaction
sequences.

Fig. 10. Reconstruction of actual event records obtained for
two squirrel monkey subjects of lever-press responding and
biting attacks. Note that lever presses which occur subsequent
to shock delivery occur prior to ensuing episodes of biting
attack.

Fig. 11. Cumulative response record segments for simultaneously
recorded manual manipulative and aggressive attack sequences
for four different squirrel monkey subjects. The general
pattern for each subject is a relatively greater probability
of pre-shock manual manipulative responding and a relatively
greater probability of post-shock biting attack responding.
Individual differences from these general patterns are circled
and are discussed in the text.

Considerable individual variation in the temporal
and intensive patterns of the reaction sequences is
typically observed (Hutchinson & Emley, 1972; Hutchinson,
Renfrew, & Young, 1971). Figure 11 is an example of four
subjects. For each subject, the relative probability of
manual manipulations is greatest before shock and the
relative probability of biting attack greatest after
shock. For several subjects, there is also a tendency
toward the brief absence of all reactions immediately
prior to shock. Each subject, however, provides a slightly
different variant on these general statements. Subject
MC-30 illustrates a relatively clear temporal
differentiation of these reaction patterns. MC-1 shows
some post-shock as well as pre-shock lever pressing. MC-12
shows some pre-shock biting attack in addition to greater
pre-shock lever pressing. MC-13 shows both post-shock
lever pressing and pre-shock biting attack. Currently our
understanding of these processes does not allow us to be
certain whether these differences depend on constitutional
or on technical differences in our handling methods or
recording apparatus. Nevertheless, the general pattern is
clear. Manual manipulative, locomotor, and sensory
scanning responses occur prior to noxious stimulation, but
cease immediately before aversive stimulation. Following
the aversive stimulus, tendencies towards locomotion,
sensory scanning, and manual manipulation
are later followed by aggression-attack sequences.

The behavioral sequences which occur both before 
and after aversive stimulation may vary even when 
stimulus parameters are held constant. Two separate
and opposite processes, habituation or facilitation, can 
occur even with invariant features of aversive stimuli. 
If aversive stimuli are relatively mild and/or frequent, 
the reaction sequences show progressive reductions in 
amplitude and frequency over successive occurrences of 
the stimulus (Hutchinson, Renfrew, & Young, 1971; 
Ulrich & Azrin, 1962). Figure 12 (upper panel) illus- 
trates the cumulative response records of post-shock 
attack behaviors over a series of relatively mild and 
frequent shock occurrences. Successive shocks result in 
progressively fewer attacks. 

The repetitive, but infrequent delivery of intense 
aversive stimuli can result in increased responses or 
facilitation, rather than habituation. Figure 12 (lower 
panel) illustrates for one squirrel monkey subject the 
effect of infrequent, intense shock deliveries upon 
pre-shock lever pressing and post-shock biting attack. 
The rate of both responses increases progressively un- 
til responding is almost continuous. Some time is 
always necessary for the development of this response 
facilitation effect; somewhere between 20 and 40 min- 
utes seems to be the necessary amount of time for 
several species tested. Once the process is begun it is 
possible to terminate all aversive stimulation, yet 
continue to observe the occurrence of these reactions 
for hours and even days (Hutchinson & Emley, 1972; 
Hutchinson, Renfrew, & Young, 1971). 

Because of the processes of habituation and
facilitation, a momentary measurement of the amplitude or
frequency of reactions is not a reliable estimate of the
current level of “aversiveness” to which the organism is
exposed. Depending upon the history of aversive stimulus
encounter, the subject’s current reactions may be
excessive or diminished by a considerable degree relative
to what another subject, or the same subject at an earlier
time, would have shown.

Fig. 12 (upper). Cumulative response record for one squirrel
monkey subject during a session where frequent and relatively
mild aversive stimulation was delivered. Habituation of
responding over successive shock episodes is evident. Shocks
were of 100 msec duration and 100 v intensity, delivered once
each minute. (lower). Cumulative response record for one
squirrel monkey subject where biting attack reactions and
manual lever pressing responses were simultaneously recorded
during response facilitation over successive shock
occurrences. Shock was 400 v delivered for 200 msec every
4 min.

Further experiments have found that additional
reaction sequences are influenced by the occurrence of
aversive stimulation. After the delivery of an aversive
stimulus, a series of manual manipulative and locomotive
responses, or after a series of biting attack reactions,
subjects begin drinking (Hutchinson & Emley, unpublished
research). Figure 13 (upper graph) presents event
recordings for two subjects during a period before and
after electric shock delivery. The typical pattern of
manual manipulative and locomotor reactions (in this case
lever presses) is apparent for both subjects during the
pre-aversive stimulus period. This behavior is absent for a
brief period immediately before the aversive stimulus
occurs. After shock delivery there is a rapid flurry of
biting attacks. Following this, both subjects show a
series of lick responses which first increase and then
decrease over ensuing seconds. Figure 13 (lower panel)
shows, in cumulative record form, the response patterns
for lever presses, bites, and licks for one subject.
Pre-shock lever pressing increases up until almost time
for shock delivery, but shows a tendency for reduction
immediately before shock. Subsequent to shock, a rapid
series of biting attack reactions occurs. Following this,
a negatively accelerated burst of water licking responses
takes place.

Fig. 13 (upper). Reconstruction of actual event records obtained
for two squirrel monkey subjects when manual lever pressing,
biting attacks, and water licking were each simultaneously
measured during periods of aversive stimulation. Following
biting attack, subjects begin drinking for several seconds.
(lower). Cumulative response record segments for one squirrel
monkey subject of manual lever pressing, biting attacks, and
water licking during periods of aversive stimulation.

Fig. 14. Temporal distribution of manual lever pressing, biting
attack reaction and water intake responses prior and
subsequent to aversive stimulation. Responses are totalled
over 12-sec intervals. Shock was delivered at “0” seconds.

Figure 14 shows the temporal distribution of manual
manipulative and locomotor responses, biting attack
responses, and licking responses averaged across an entire
session for one subject. Here, in the individual
intershock intervals, lever pressing shows a progressive
increase in the period before shock. As the minimum class
intervals are 12 seconds, no abrupt termination of
responding in the few seconds prior to shock is evident,
although this is frequently seen in individual records.
Subsequent to shock delivery, a brief series of lever
presses occurs, followed immediately by biting attacks
upon the rubber hose. This attack reaction subsides
progressively and then water licking responses occur,
first at a higher and then at a lower frequency, falling
over some seconds to zero.
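
The totalling of responses over 12-second class intervals described for Figure 14 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the laboratory's actual procedure: the function name `bin_around_shock`, the rule of assigning each response to its nearest shock, and the example timestamps are all hypothetical.

```python
import bisect
import math
from collections import Counter


def bin_around_shock(response_times, shock_times, bin_width=12.0):
    """Total responses in class intervals relative to the nearest shock.

    Times are in seconds from session start. Each response is assigned
    to the nearest shock (an assumption made for this sketch), and its
    offset from that shock is placed in a bin `bin_width` seconds wide.
    Returned keys are bin start times relative to shock at 0.
    """
    shocks = sorted(shock_times)
    if not shocks:
        raise ValueError("at least one shock time is required")
    counts = Counter()
    for t in response_times:
        i = bisect.bisect_left(shocks, t)
        # Candidate shocks: the one just before and the one just after t.
        candidates = shocks[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda s: abs(t - s))
        offset = t - nearest
        counts[math.floor(offset / bin_width) * bin_width] += 1
    return dict(sorted(counts.items()))


# Hypothetical lever presses before a shock at 240 s: responding stops
# 8 s before shock, yet the -12..0 s bin still shows four responses.
presses = [229.0, 230.0, 231.0, 232.0]
print(bin_around_shock(presses, [240.0]))
```

The example shows why a pause of a few seconds before shock cannot appear at this resolution: every response in the final 12 seconds falls into the same class interval, whether or not responding stopped shortly before shock.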

Limiting or expanding other response opportuni- 
ties has effects on the magnitude or frequency of 
drinking responses which are similar to those previ- 
ously described for lever pressing and biting. Remov- 
ing the opportunity to attack, by removing the rubber 
hose from the chamber, increases the number of licks 
and the amount of water consumed. These reaction 
shifts remain for as long as the environment is altered. 
Figure 15 portrays the effect for one subject in
successive experimental sessions where the hose first was
present, then removed for several days, and then returned
to the chamber. Removal of the hose, and thus the
opportunity to attack, increased the frequency of licking
subsequent to shock. Return of the opportunity to attack
caused biting attacks to again occur as before. Licking
responses were reduced to their original level.

Drinking reactions are also influenced by the temporal
pattern and intensity of shock delivery in a manner
similar to that already illustrated for pre-shock manual
responses and post-shock biting attack reactions. If
intense aversive stimuli are delivered infrequently,
licking responses will begin to occur at a far greater
rate and in an almost continuous episode as compared with
earlier occasions. Figure 16 displays, for one subject,
this facilitation of drinking reactions when continuing,
infrequent, and intense aversive stimuli are presented.

BY-PRODUCTS OF AVERSIVE CONTROL

R. R. Hutchinson

Fig. 15. Cumulative response record segments of fluid intake
responses for one squirrel monkey subject before, during and
subsequent to removal of the opportunity to attack. Intake
responses and the amount of fluid consumed are increased in
the absence of attack opportunity. [Figure labels: licks;
subject MC-30; hose present; ordinate, cumulative responses;
time scale, 12 minutes.]

Fig. 16. Cumulative response record segments of fluid intake
responses for one squirrel monkey subject in early and later
portions of a period of aversive stimulation showing
development of response facilitation.

From the foregoing discussion and illustrations, it
is evident that occurrence of aversive stimuli can
influence a series of complex motor reactions for a
period after aversive stimulation and, through
conditioning, during stimulus periods prior to the
delivery of a particular ensuing aversive stimulus.
Further, there is a predictable ordering of the
frequencies or tendencies for various reaction sequences
to occur, both prior to and after aversive stimulus
delivery. During conditional stimulation, locomotor,
manual manipulative, and sensory scanning reactions occur
at a progressively increasing frequency until some period
immediately before the aversive unconditional stimulus
delivery. During that same period, aggression-attack
sequences and drinking reactions can occur, though these
are of lesser strength than the locomotor and manual
manipulative reactions. Immediately prior to unconditional
stimulus delivery, all reaction tendencies cease; a period
of behavioral arrest or suppression occurs. After the
occurrence of an unconditional aversive stimulus,
locomotor, manual manipulative, and sensory scanning
responses are maximal for a brief period. Subsequent to
this time, aggression-attack reaction sequences develop to
a high frequency and then gradually return to zero. Later
yet, and subsequent to attack reactions, drinking
responses occur, first at a progressively increasing and
then at a gradually decreasing frequency over ensuing
seconds and minutes. These sequential relationships are
sketched symbolically in Figure 17.

Manual manipulation, arrest, attack, and drink
reactions may or may not occur in the particular stimulus
context present at the time a subject is responding.
Sometimes, only partial sequences will occur since
alternate response reaction classes may, through
reinforcement, completely or substantially displace in
time and strength the basic reaction patterns noted.

Fig. 17. Schematic illustration of the sequential features of
behavioral processes produced by aversive stimulation. During
the conditional stimulus period, locomotor, manual
manipulative and visual scanning responses are progressively
elevated. In a more imminent temporal position to
unconditional stimulation, behavioral reactions are arrested
or suppressed. Subsequent to unconditional stimulus delivery,
manual manipulative, locomotor and visual scanning responses
are increased. Subsequent to this, aggression-attack sequences
occur. At yet a later point in time, fluid intake responses
begin.

Each of these reaction types involves a complex
sequence of coordinated movements, frequently involving
many muscle groups in concert with sensory scanning
processes acting upon particular objects or places in the
environment. So far, our findings suggest that these
reactions, though flexible and capable of considerable
modification, nevertheless tend to occur in an essentially
identical fashion in many members of the same species.
Additionally, individual subjects show great consistency
between successive behavioral episodes over long periods.
Such species specificity, coupled with uniformity across
successive instances, makes it tempting to refer to these
reactions as reflexes or innate reactions. We have
refrained from this for several reasons. We have little
evidence regarding the portions or features of these
performances which may be independent of learning. Also,
the modification of these reaction sequences which comes
about from alterations in post-response environment
(reinforcement) implies a reliance on feedback that is
perhaps not a common feature of other processes referred
to as reflexive or innate, and has been
little emphasized in connection with them.


BEHAVIOR CAUSED BY AVERSIVE 
STIMULI IN ESCAPE PARADIGMS 


Operant conditioning procedures involving aversive
control may be divided into the
response-produced-stimulus-offset (“escape”) and
response-produced-stimulus-onset (“punishment”) paradigms.
In escape conditioning or performance paradigms some
aspect of behavior results in the termination of either
the conditional aversive stimulus or the unconditional
aversive stimulus. Often where the conditional stimulus is
terminated the paradigm is defined as avoidance
conditioning—a reference to the fact that such responding
also avoids occurrence of the unconditional stimulus. For
both unconditional-stimulus and conditional-stimulus
escape, the response occurs in the presence of an
identifiable unique aspect of the environment. As such,
the reaction sequences which will occur can be predicted,
at least in part, by knowledge of how behavior is
influenced directly by aversive
stimuli as described in the previous section.

Several general behavioral relationships are observed
during escape/avoidance learning and performance. During
presentation of the conditional stimulus, locomotor,
manual manipulative, and visual scanning reactions will be
likely (Brogden, Lipman, & Culler, 1938; Miller, 1948;
Mowrer & Lamoreaux, 1942). As the conditional stimulus
continues toward the time ordinarily corresponding to
introduction of the unconditional stimulus, behavior will
initially increase in probability and then decrease
(Anger, 1963; Hoffman, 1966; Sidman, 1955). After the
occurrence of the unconditional stimulus, a brief period
of heightened probability of manipulative, locomotor, and
visual scanning responses will be evident (Sidman, 1958;
Hoffman, 1966). Later yet, or in situations where escape
reactions are not reinforced, aggression-attack sequences
will occur (Azrin, Hutchinson, & Hake, 1967). Each of
these patterns is similar to that described in the
previous section where such performances were shown to
result from conditional and unconditional stimulation
directly. Figure 18 (upper panel) presents a segment from
a cumulative record of one subject on a Sidman avoidance
schedule, during a portion of the session when the
unconditional aversive stimulus actually did occur twice.
The rapid post-stimulus burst of lever pressing behavior
frequently reported in this circumstance is evident. The
lower graph of Figure 18 presents a segment from two
simultaneous cumulative records obtained from another
subject on similar experimental parameters, except that a
bite hose was placed in the test chamber. For this animal,
the opportunity to engage in attack sequences caused a
total displacement of the post-stimulus lever-pressing
behavior by a flurry of biting attacks on the rubber hose
(Azrin, Hutchinson, & Hake, 1967). In numerous avoidance
sessions where the attack opportunity is present we have
observed that subjects typically show a nearly or totally
complete shift in the post-shock reaction sequences to
that pattern shown in the lower graph of Figure 18. As
suggested on the basis of data presented in earlier
sections, biting may somehow serve to reduce the effect of
shock. Coupled with an essentially zero reinforcement
probability for bursting responses on the lever, this may
produce alteration of the two performances seen in the
lower portion of Figure 18.

The literature also shows that escape or avoidance
behavior is often absent or infrequent early in a testing
session even after learning seems stable in earlier tests.
This daily or weekly recurrent absence of initial strength
coupled with the progressive increase in responding over
subsequent shocks during testing has been referred to as
“warm up” (Hoffman, 1966; Hoffman, Fleshler, & Chorny,
1961). In the previous section the similar facilitation of
shock-produced performances by repetitive shock deliveries
was illustrated.
Thus the temporal and intensive patterns of manual
and locomotor behaviors observed on schedules
involving escape-avoidance procedures are similar or
identical to those observed routinely when such contingencies
are not in effect. To what extent, therefore, are the
performance elements normally observed under escape
and/or avoidance training and learning procedures
attributable to the contingencies, rather than to direct
contact with conditional and unconditional aversive
stimuli? This question can
only be answered conclusively, it seems to me, by a 

Fig. 18 (upper). Cumulative response record segments during
Sidman avoidance conditioning for one squirrel monkey subject.
Subsequent to shock delivery, a brief flurry of lever presses
occurs. Avoidance parameters were: response-shock interval, 30
sec; shock-shock interval, 30 sec; shock intensity, 400 v; shock
duration, 200 millisec. (lower) Cumulative response record segments
during a Sidman avoidance program. Simultaneously recorded
manual lever pressing and biting attacks for one squirrel monkey
subject are shown. Note that subsequent to shock delivery, the
flurry of responding occurs upon the bite hose.

careful three-part experimental sequence. Subjects
must first be exposed for an extended period to the 
conditional and unconditional stimuli in the absence 
of any contingencies. Subsequent to this, contingencies 
may be introduced and shifts in performance noted. 
To establish the certainty of the contingency’s long-term
influences, at a later time a return to the original
noncontingent stimulation must be arranged. Un- 
fortunately, these conditions are almost never met 
experimentally. Typically, a shift in some feature of
the contingencies, or in their temporal or interlocking
character, has been the primary experimental manipulation,
and a shift from one level of performance to
another has been the observed result. Such tests are
helpful, but they leave unanswered the question of 
what topographic and intensive features of behavior 
are independent of, or dependent on, contingencies 
per se. In our studies we typically do not find that
subjects respond at high continuous response rates 
during response-independent stimulus presentation. 
A shift to avoidance conditions from response-inde- 
pendent shock usually produces large increments in 
response rate. An example of such an effect may be 
seen in Figure 19. Here selected segments of cumula- 
tive records under three successive experimental con- 
ditions are displayed. This particular subject had a 
two-year history of response-independent shock every 
four minutes for one hour sessions. During this period 
a variety of tests were performed using this behavioral 
baseline. The baseline behavioral pattern is shown as
the uppermost segment of Figure 19. At this point 
avoidance conditioning was begun. By Day 22 of the 
avoidance program a high steady rate of nearly shock- 
free responding was occurring. In subsequent tests the 
subject was returned to response-independent shock 
conditions and behavior was observed over an ex- 
tended period. The rate of responding gradually fell 
to the level seen in the bottom segment of Figure 19. 
Two features of these results are important. First, it 
took a long time for performance to stabilize at a 
lower level after exposure to avoidance conditioning; 
second, the level at which performance stabilized was 
10-15 times greater than it had been before the 
avoidance training. Each of these results has also 
been noted in earlier studies (Kelleher, Riddle, & 
Cook, 1963; Sidman, 1960). 
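
The Sidman avoidance parameters cited for Figures 18 and 19 can be made concrete with a small simulation. The following Python sketch is illustrative only and is not taken from the experiments described: the function name `sidman_shock_times`, the session length, and the assumption that a session opens on the shock-shock timer are all hypothetical. It assumes the standard definition of the schedule, in which each response postpones the next shock by the response-shock interval and, absent responding, shocks recur at the shock-shock interval.

```python
def sidman_shock_times(response_times, rs_interval=30.0, ss_interval=30.0,
                       session_length=300.0):
    """Return the times (sec) at which shocks are delivered.

    Hypothetical helper for illustration. Each response resets the pending
    shock to rs_interval after that response; with no intervening response,
    shocks recur every ss_interval.
    """
    # Only responses that fall inside the session can affect the schedule.
    responses = sorted(t for t in response_times if t < session_length)
    shocks = []
    i = 0
    next_shock = ss_interval  # assumption: session starts on the S-S timer
    while True:
        if i < len(responses) and responses[i] < next_shock:
            # A response before the pending shock postpones that shock.
            next_shock = responses[i] + rs_interval
            i += 1
        elif next_shock <= session_length:
            # No response intervened, so the shock is delivered.
            shocks.append(next_shock)
            next_shock += ss_interval
        else:
            break
    return shocks


# No responding: shocks arrive at the shock-shock interval.
print(sidman_shock_times([], session_length=100.0))

# Responding every 20 sec keeps the session shock-free.
print(sidman_shock_times([20, 40, 60, 80], session_length=100.0))
```

Under these assumptions, a subject that responds at least once per response-shock interval receives no shocks, which corresponds to the "nearly shock-free responding" described for the avoidance phase.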


BEHAVIOR CAUSED BY AVERSIVE 
STIMULI IN PUNISHMENT PARADIGMS 


In punishment paradigms, presentation of the con- 
ditional or unconditional stimulus is contingent upon 

Fig. 19. Cumulative response record segment for one squirrel
monkey subject during response-independent shock, Sidman 
avoidance conditioning, and subsequent response-independent 
shock sessions. Avoidance conditioning results in a large in- 
crease in manual responding compared to initial response- 
independent shock baseline performance. Subsequent to avoid- 
ance, responding gradually falls until it stabilizes but at a level 
of responding many times greater than prior to avoidance 
learning. 


the occurrence of some response. Since in this para- 
digm the unconditional stimulus and the conditional 
stimulus are not present during a response, but only 
afterward, the ongoing response rate will be deter- 
mined by other conditions which are present, such 
as food deprivation, food reinforcement schedule, etc. 
Generally, since the establishment of a contingency 
between a response and an unconditional aversive 
stimulus simultaneously places a negative reinforce- 
ment contingency on behaviors other than the to-be- 
punished response (Dinsmoor, 1954), and since, as has 
been shown in the prior section, the terminal effect of 
conditional stimuli is to produce depression or an ab- 
sence of behavior, any responding which results in the 
occurrence of unconditional stimulation will be re- 
duced. After the actual occurrence of an uncondi- 

Fig. 20. Cumulative response record segments for one squirrel
monkey subject during variable-interval food reinforcement and
simultaneous continuous punishment of manual lever pressing.
Biting attack episodes occur subsequent to shock deliveries. Data
are reconstructions of event records obtained from R. E. Ulrich,
Western Michigan University, Kalamazoo, Michigan.


tional stimulus, however, other reaction sequences 
ordinarily produced by unconditional aversive stimu- 
lation will be observed. Figure 20 presents the simul- 
taneous cumulative records of a subject on a variable- 
interval food-reinforcement schedule and simultaneous
continuous shock punishment for lever pressing. Also, 
the opportunity to attack a rubber hose was present. 
Note that bursts of biting attack tend to occur after 
the delivery of electric shocks (Ulrich, personal com- 
munication). Whether fluid intake responses are in- 
creased subsequent to the occurrence of punishment 
by an unconditional stimulus is unknown to me.
When responding is maintained and produced by
arrangements other than positive reinforcement sched- 
ules, additional interactions with punishment pro- 
grams can occur. When unconditional or conditional 
aversive stimulation is made contingent on performances
which have been generated by aversive stimulation
(either by escape-avoidance routines, or by
response-independent shock application), the results can
appear confusing. Sometimes in these situations, the 
contingent application of conditional aversive stim- 
ulation can actually elevate behavior. Figure 21 pre- 
sents selected cumulative record segments for two 
squirrel monkey subjects during response-independent 
shock delivery programs and subsequent response- 
contingent shock delivery schedules. The two records 
show the simultaneous results for pre-shock manual 
manipulative responses and post-shock biting attacks. 
In the case of subject MC-30, during response-inde- 
pendent shock delivery (labeled No Punishment) lever 
pressing occurred at a progressively increasing rate 
until just before shock when behavior was absent. 
Subsequent to shock delivery, biting attacks occurred, 
first at a high rate and then at a decreasing rate. The
subject who experienced shock delivery contingent
upon lever pressing (labeled Punishment) produced
the pattern illustrated, one seen in numerous other
subjects in the laboratory. Generally, no lever press- 
ing occurred during the first few minutes of the 
response-independent shock delivery sessions, nor on 
the first day when shocks were response produced. At 
about seven minutes into the first punishment session 
the initial lever press occurred and produced a shock. 
Subsequent to this shock a flurry of biting attacks 
occurred. For this subject, no further lever presses oc- 
curred during any session for the next 20 punishment 
sessions. A quite different performance is illustrated 
in the right hand column, for subject MC-5. This sub- 
ject had a much higher rate of pre-shock manual re- 
sponding during the response-independent shock ses- 
sions. The consistent pattern was a progressively 
increasing lever press rate up to almost the instant of 
shock delivery, with at most only a very brief pause 
prior to shock occurrence. Further, little or no biting 
occurred subsequent to shock. Introduction of punish- 
ment contingencies produced an actual increase in 
pre-shock lever pressing. By Day 20 on this procedure, 
in fact, an increase in responding under contingent 
shock conditions was evident. Results similar to those 
for MC-5 have also been reported in recent experi- 
ments (Morse, Mead, & Kelleher, 1967; Stretch, Orloff, 
& Dalrymple, 1968) in other laboratories and have 
served as the evidential base for a theoretical position 

Fig. 21. Cumulative response record segments obtained simultaneously
for manual lever pressing and biting attack reactions
for two squirrel monkey subjects prior to and during punishment
of lever pressing.

that electric shock and other aversive stimuli can
serve as positive reinforcer-like events, depending 
upon the schedule of presentation of such events. 
However, other interpretations are possible. Figure 
21 shows prompt and total cessation of responding for 
one subject (MC-30) when shifted from a response- 
independent shock schedule to a response-contingent 
shock schedule. Figure 22, on the other hand, shows 
that reinstatement of response-independent shock pro- 
cedures for subject MC-30 produced a gradual, but 
progressive return of responding to a pattern and 
level identical to that observed prior to punishment 
conditions. In the case of subject MC-5 and others dis- 
playing similar high response rate patterns, additional 
experiments have been conducted. Figure 23 presents 
full cumulative records for MC-5 through a series of 
experimental tests involving successive alterations in 
the fixed-interval shock punishment schedule, and 
finally followed by a return to response-independent 
shock procedures. No other response maintenance procedures
(such as food, time out from shock, etc.) are

Fig. 22. Cumulative response record segments for one squirrel
monkey subject during final testing of punishment for manual
ishment contingency.

ever a part of these experiments. In the left hand
column, the upper tracing illustrates final perfor- 
mance under response-independent shock conditions. 
Over a series of months, the fixed-interval value of the 
punishment schedule was gradually reduced from four 
minutes to two minutes, one minute, 15 seconds, and 
finally to 5 seconds. During these tests responding was 
not reduced; rather, the overall rate of response increased.
Throughout these tests, it was noted that in not a
single instance was responding low enough to cause
shock to be delayed more than a few seconds, as compared
to when shock would have occurred on a
response-independent shock baseline. The high frequency
of responding exhibited by MC-5 caused the
shock delivery to occur without change, just as though
scheduled by a clock without regard to behavior. For 
reasons unknown to us, this situation was altered for 
the first time at point A on Day 18 of the 5-sec FI 
punishment condition, shown in the center column 
of Figure 23. On the next day (Day 19) the pattern of
performance was immediately different from the be- 
ginning of the session. No responding occurred for 
almost 4 minutes. The first response produced a 
shock. This happened three more times during session 
19. Subsequently, no responding was observed during 
any of the 50 additional punishment test sessions. 

A return to response-independent shock conditions 
also produced, for this subject, a gradual increase in
responding until it assumed the same temporal and
intensive pattern initially displayed prior to punishment
testing. This behavior, which might have been
interpreted as “maintained” by the response-produced
shock contingency, was in fact a shock-produced-response
performance. Previous reports of similar
effects have in all cases inadvertently employed one or
more of several different collateral response-producing
or response-reinforcing procedures. Frequently these
studies have employed an avoidance history (Byrd,
1969; Kelleher & Morse, 1968; McKearney, 1968;
Stretch, Orloff, & Dalrymple, 1968). In the previous
section we saw that an avoidance history, even long 
past, could markedly increase the response generating 
effects of subsequent response-independent shock de- 
livery schedules. Other experiments have employed 
free-shock baselines (McKearney, 1969; Morse, Mead, 
& Kelleher, 1967). In this and earlier sections we have 
shown that free shock can generate responding. In 
still other experiments, simultaneous reinforcement 
(Azrin, Holz, Hake, & Ayllon, 1963) procedures such 
as time out from shock (McKearney, 1970; Morse & 
Kelleher, 1970), or previous food reinforcement his- 
tories have been used (Morse & Kelleher, 1970). Such 
procedures produced high levels of responding necessary
both to compete with the response-suppressing
effects of the punishment schedule and/or to generate
behavioral patterns which obscure or eliminate discriminable
episodes of shock reduction following absence
of responding. Each of these practices can
confuse the outcome of a particular study and cloud 
understanding of the basic behavior-generating effects 
of aversive stimulation. 
15


Stimulus Control
and Inhibitory Processes*


OVERVIEW 


A reinforcer never occurs in a vacuum, isolated 
from outside influences. Environmental events, the 
stimuli of stimulus control, are present before, during,
and after the occurrence of the reinforcer. The rapid 
solution of major substantive problems in the area of 
stimulus control is due in large measure to the ele- 
gance of techniques for reliably assessing the control 
of behavior by these environmental events. These 
techniques are considered in the following section. 

The third section considers the influence of a number
of specific factors on the characteristics of the
empirically obtained stimulus-generalization gradient. 
Discussed are the effects of the schedule of reinforce- 
ment during training and a microanalysis of the 
generalization gradient in terms of interresponse times. 


* Preparation of this chapter and the author’s research was 
supported in part by NIMH grant No. 5 R01 MH 18342. I thank
the many graduate and undergraduate students at Michigan 
State University for their comments on an earlier version of 
this chapter. I am especially indebted to Vern Honig, Tom 
Kodera, John Staddon, Herb Terrace, Dave Thomas, Stan Weiss, 
and many other colleagues for their constructive comments and 
guidance in our pursuit of a clearer understanding of stimulus 
control. 

Discrimination training is one of the most powerful
determinants of stimulus control. The fourth section 
defines the various types of discrimination training 
and gives particular attention to the measurement 
and interpretation of inhibitory phenomena. Spence’s 
theory of discrimination learning is reexamined in the 
context of recent evidence on the impact of discrimi- 
nation training on stimuli varying within a single 
dimension on the stimulus-generalization gradient. In 
addition, the newer dynamic models of stimulus con- 
trol are considered as they apply to inhibitory phe- 
nomena. 

The basic assumption in the fifth section is that 
the determinants of inhibitory phenomena are inde- 
pendent of whether the discriminative stimuli are 
selected from a single or two independent dimensions. 
This section examines the effect on inhibitory phe- 
nomena of amount of training, sequence of the dis- 
criminative stimuli, and schedule of reinforcement. 
The joint effect of the response rate and incentive 
differences between the two components of the multi- 
ple schedule on the shape of the generalization gradi- 
ent is also considered. 

The phenomenon of errorless learning has played 
a major role in theories of stimulus control. It refers 
to discriminations in which the rate of responding
during S− is negligible from the first session of discrimination
training. Errorless learning was considered
an exception to the basic laws of discrimination
learning in that S− was apparently neutral, rather
than acquiring inhibitory properties during errorless
discrimination training. In the sixth section data are
presented which indicate that basic inhibitory phenomena
are in fact obtained following errorless learning.


THE DEFINITION AND MEASUREMENT 
OF STIMULUS CONTROL 


The Vocabulary of Stimulus Control


STIMULUS CONTROL 


Stimulus control is observed when a change in a
particular property of a stimulus produces a change in 
some response characteristic, as in the rate or prob- 
ability with which a response occurs. For example, the 
onset of a light is said to control behavior if respond- 
ing occurs at a higher (or lower) rate in the presence 
of the light than in its absence. 

The rationale for introducing the new term stim- 
ulus control stems from the semantic confusion which 
Brown (1965) noted as existing between the terms 
discrimination and generalization. To illustrate this, 
suppose different rates of responding to a red stimulus 
and to a green stimulus are established. Then the 
data can be described as indicating a discrimination 
between red and green—or, with the logically equiva- 
lent statement, as indicating a failure to generalize 
between red and green. Therefore, theoretical at- 
tempts to explain generalization as a failure to dis- 
criminate, or discrimination as a failure to generalize, 
may involve the fallacy of using different words to 
describe the same behavioral process. This problem is 
avoided when discrimination and generalization are 
defined as opposite ends of the single continuum of 
stimulus control. 

Nondifferential and differential reinforcement are 
two training procedures which are frequently em- 
ployed in experiments on stimulus control. In non- 
differential reinforcement, a response is equally rein- 
forced in the presence of all the stimuli in the 
environment, so that the consequences of responding 
remain identical independent of stimulus change. 
This procedure is typically employed to obtain base 
line levels of responding to stimuli that subsequently 
are differentially reinforced. Differential reinforcement
is a class of training procedures in which
responses are unequally reinforced. When differential 
reinforcement depends upon the stimuli in the en- 
vironment, as in most experiments in stimulus con- 
trol, the procedure is called discrimination training. 
In discrimination training, certain stimuli predict 
occasions when a class of responses is reinforced and 
other stimuli predict occasions when those responses 
are not reinforced or when they are reinforced according
to a different schedule. Stimuli that are correlated
with periods of reinforcement are often designated
positive stimuli (S+), while those stimuli which
are correlated with periods of extinction are designated
negative stimuli (S−).


THE STIMULUS-GENERALIZATION
GRADIENT


At the completion of discrimination training, different
rates of responding are typically associated with
each stimulus. However, it is not apparent how the
organism will respond to test stimuli that have not
been previously presented. It is not clear to what
properties of the stimulus the organism is responding.
To answer these questions, the stimulus may be
varied along its various dimensions. For example, in
a discrimination between red and green, dimensions
such as the size of the stimulus, its luminance, and its
wavelength may have acquired control over responding.
The dimension of generalization is the continuum
along which a particular property of a stimulus
is varied during a test for stimulus control.
Physical dimensions such as the wavelength of a light,
the frequency or intensity of a tone, or line orientation
are usually selected. A stimulus-generalization
gradient is the function obtained when the total number
of responses to each of the stimulus values presented
during the generalization test is plotted
against the dimension of generalization.


Techniques for Obtaining a 
Stimulus-generalization Gradient


A stimulus-generalization gradient is employed to 
determine the properties of the stimuli that have 
acquired control over responding. Responding may
decrease, increase, or remain unchanged during a 
stimulus-generalization test. A variety of procedures 
have been developed for obtaining stimulus-general- 
ization gradients. No single procedure is appropriate 
for all experimental problems, and the shape of the 
gradient depends upon the procedure employed. The 
advantages and disadvantages of each procedure are 
detailed in the sections that follow. Note that in each
of these procedures, specific training is usually given 
with respect to only one or two stimuli on a dimen- 
sion prior to the stimulus-generalization test. Conse- 
quently, most of the stimuli presented during the 
generalization test are novel.


TRANSIENT OR EXTINCTION METHODS


In the single-stimulus method, a response is reinforced
in the presence of one stimulus. In a subsequent
test, extinction to a single test stimulus occurs.
A separate, independent group is required for each 
data point on the stimulus-generalization gradient, 
which is obtained by averaging the total number of 
responses during extinction for each animal in the 
group. While single-stimulus tests can be used with
operant methods, and have indeed been studied by
Hiss and Thomas (1963), the advantages of presenting
all test values to each individual subject are so great
that the latter approach has been used almost exclusively.

Skinner developed the multiple-stimulus method 
for assessing generalization which was first reported 
in 1950. In a report prepared in 1944 but not pub- 
lished until 1965, Skinner described a precursor of 
the most common method for obtaining generaliza- 
tion gradients with operant methods. The classic 
study reported by Guttman and Kalish (1956) in- 
corporated many aspects of Skinner’s procedure and 
determined the direction of subsequent research in
stimulus generalization. 

Guttman and Kalish selected the visual spectrum 
as the stimulus dimension to exploit the excellent 
color vision of pigeons. During training, the response 
key was illuminated with a monochromatic light 
source. Responses on the key were reinforced with 
food on a variable-interval (VI) 1-min schedule of 
reinforcement. After a substantial rate of responding 
was established, generalization testing was carried out 
during extinction in a session which began with 
several reinforcements to the training stimulus. Dur- 
ing the generalization test, 11 different-colored 
stimuli, including the training stimulus, were ran- 
domly presented on the key each for 60 sec. Each 
stimulus was repeated 12 times within the test in an 
attempt to average out the differences due to the slow 
decrease in the response rate produced by extinction. 
After the first generalization test, the birds were re- 
trained with reinforcement and a second generaliza- 
tion test identical to the first was administered. 

When Guttman and Kalish administered this 
second generalization test, the generalization gradient 
from the second test showed fewer total responses 
than the gradient obtained from the first test. This
illustrates a major disadvantage of the extinction 
methods: the measurement of generalization is con- 
taminated by the effects of extinction. Optimal assess- 
ment of stimulus generalization requires comparison 
with a behavioral base line which is both stable and 
recoverable. Unfortunately, responding during extinc- 
tion lacks both of these characteristics. The rate of 
responding decreases, eventually to zero, and repeated 
exposures to extinction reduce the number of re- 
sponses obtained. 

The advantage of the multiple-stimulus method is 
that a stimulus-generalization gradient is obtained 
from a single organism and averaging of data from 
different organisms is not required. 


MAINTAINED GENERALIZATION METHODS 


Procedures in which a constant rate of responding 
is maintained by reinforcement are generally pre- 
ferred to transient or transition procedures in which 
the rate of responding is changing, as described above. 
Several investigators (D. Blough, 1969, 1975; P. Blough,
1972; Malott, Malott, & Glenn, 1973; Pierrel, 1958)
have employed maintained generalization procedures.
The session is divided into trials of generally short
duration (e.g., 20 sec). On training trials responding is
reinforced intermittently in order to maintain a base
line rate of responding. On test trials, responding is
never reinforced, and a generalization gradient is ob- 
tained during each session by presenting the test 
stimuli in random order. As long as the animal fails 
to discriminate test from training trials, responding 
occurs during the test trials even though reinforce- 
ment never occurs. As D. Blough (1969) has shown, 
the technique is extremely powerful, since hundreds 
of generalization gradients can be obtained from the 
same animal over a period of many months. A typical 
finding with the maintained procedure (P. Blough, 
1972) is that the stimulus-generalization gradient 
around the training stimulus gradually becomes 
sharper within the sensory limits of the organism. 
This method is very useful when closely spaced 
stimuli are used in the test. 

A disadvantage of maintained procedures is that 
several weeks of pretraining are required to establish 
a constant rate of responding on the base line sched- 
ule of reinforcement before a stimulus-generalization 
gradient can be obtained. Another disadvantage is 
that when the test stimuli are spaced far apart, a dis- 
crimination between training and test stimuli is ac- 
quired and responding to the test stimuli rapidly falls 
to zero. In this latter case, the extinction technique 
may be more appropriate. Whether the maintained
procedures will eventually replace the transient pro- 
cedures of obtaining the gradient during extinction 
remains to be seen. 


SIMULTANEOUS OR CONCURRENT 
METHODS 


The methods just described are appropriate for 
assessing stimulus control in successive discrimina- 
tions. Stimulus generalization is also measured in 
simultaneous discriminations. In a simultaneous discrimination
two stimuli are presented to the organism
at the same time: S+, correlated with reinforcement,
and S−, correlated with extinction. When a discrete-trial
procedure is employed, each response to S+ is
reinforced and each response to S− (or error) produces
a time-out during which the onset of the next trial is
delayed. When responses at two different locations are
intermittently reinforced in the presence of two different
stimuli, the procedure is called a concurrent
schedule of reinforcement. During a concurrent generalization
test, a variety of test stimuli are presented
at two locations during extinction, and the number
of responses to each stimulus which occur at each
location is recorded. As employed by D. Blough
(1973) and Catania, Silverman, and Stubbs (1974), this
procedure produces two stimulus-generalization gradients,
one for each stimulus location.

A second procedure, described by Honig, Beale, 
Seraganian, Lander, and Muir (1972), employs a concurrent
schedule with an explicit changeover or advance
response. Two discriminative stimuli, S+ and
S−, are presented alternately at one location so that
only one stimulus is present at a time. A second response,
at a different location, terminates the current
stimulus and produces the next stimulus in a predictable
series. The advantage of this procedure is that it
expands the range of dependent variables to include
the time as well as the number of responses in the
presence of each stimulus (or class of stimuli). This 
procedure has been employed by Honig et al. and 
Beale and Winton (1970) to measure generalization of 
the pigeon’s response of terminating a stimulus as- 
sociated with extinction. 


The Analysis of Data from a Generalization Test:
Absolute vs. Relative Gradients


A generalization gradient based on the total num- 
ber of responses obtained during extinction is called 
an absolute generalization gradient. An absolute gradient
is the simplest method of presenting generalization
data. However, some important experimental
questions require a comparison of different conditions 
and cannot be answered with absolute gradients. Sup- 
pose that an experimenter wants to compare the 
amount of generalization produced by two conditions 
which produce an extreme difference in the total 
number of responses during extinction. For instance, 
two different schedules of reinforcement may produce 
drastic differences in the rate of responding during 
extinction for two different groups of subjects. A rela- 
tive stimulus-generalization gradient can be used to 
compare conditions in such cases, and also when indi- 
vidual absolute gradients differ substantially in their 
mean rates. 

In a relative gradient, the number of responses to 
each test stimulus is expressed as a percentage of the 
total responses to all stimuli. Relative gradients are
also sometimes plotted as a proportion of responses
made to the training stimulus. When relative gradients
are averaged, equal weight is given to each
gradient. In constructing a relative gradient, the
experimenter assumes that a given absolute decrement
is psychologically greater against a base line of low
responding to the training value than against a high
base line. Comparisons can be made between the
slopes of relative gradients since the gradients have
been equated for the differences in the number of responses
obtained during the generalization test. However,
as Morgan (1969) points out, conclusions about
slopes and differences are safest when absolute as well
as relative gradients intersect.
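
The two normalizations just described can be illustrated with a short Python sketch. The function names and the response counts are hypothetical and serve only as a worked example; they are not data from any study cited here. The example shows two absolute gradients that differ tenfold in total responding yet yield the same relative gradient.

```python
def relative_gradient_percent(responses):
    """Express responses to each stimulus as a percentage of the
    total responses to all stimuli."""
    total = sum(responses.values())
    if total == 0:
        raise ValueError("no responses were recorded")
    return {stim: 100.0 * n / total for stim, n in responses.items()}


def relative_gradient_vs_training(responses, training_stimulus):
    """Express responses to each stimulus as a proportion of the
    responses made to the training stimulus."""
    base = responses[training_stimulus]
    if base == 0:
        raise ValueError("no responses to the training stimulus")
    return {stim: n / base for stim, n in responses.items()}


# Hypothetical absolute gradients (total responses per wavelength in nm);
# 550 nm is taken to be the training stimulus.
high_rate = {530: 100, 540: 200, 550: 400, 560: 200, 570: 100}
low_rate = {530: 10, 540: 20, 550: 40, 560: 20, 570: 10}

print(relative_gradient_percent(high_rate))
print(relative_gradient_percent(low_rate))
print(relative_gradient_vs_training(low_rate, 550))
```

Because each relative gradient sums to 100 percent regardless of the absolute response total, an average of relative gradients weights every subject or condition equally, as noted above.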


Control for Stimulus Preferences


Suppose an experimentally naive pigeon is placed 
in an experimental chamber which contains a key
which can be illuminated with various stimuli. The
bird’s responses on the key are not reinforced, but
various stimuli are projected on the key and the
number of responses to each stimulus is recorded.
Most investigators assume that such a procedure will 
produce a flat gradient with few or no responses to 
each stimulus and conclude that no stimulus preferences
are present. This conclusion is probably incorrect.
Some species exhibit marked preferences for
certain stimuli which are determined by hereditary 
and developmental rather than by reinforcement vari- 
ables. If these effects are not considered, the results of 
a stimulus-generalization test may be misinterpreted. 
For example, by measuring the unconditioned peck- 
ing behavior of newly hatched gull chicks which were 
presented with various monochromatic stimuli, Hail- 
man (1969) obtained a preference function resembling 
a stimulus-generalization gradient. Several reviewers 
(e.g., Hinde & Hinde, 1973; Seligman & Hager, 1972; 
Shettleworth, 1972) have stressed that the experi- 
menter must consider the constraints which the 
organism’s heredity, anatomy, and development im- 
pose upon the behavior of an organism in a learning 
experiment. 


SOME DETERMINANTS OF 
GENERALIZATION GRADIENTS 


Nondifferential Reinforcement 


A VI schedule of positive reinforcement is often 
employed to sustain a moderate rate of responding 
prior to the generalization test in each of the tech- 
niques for measuring stimulus generalization. In these 
techniques the schedule of reinforcement is usually 
held constant at VI 1-min, at least when pigeons serve 
as subjects. However, by manipulating the schedule 
of reinforcement in effect prior to the generalization 
test, Hearst and his colleagues discovered that the 
schedule of reinforcement is one of the most potent 
determinants of the slope of a stimulus-generalization 
gradient. Different schedules of reinforcement pro- 
duce widely divergent absolute gradients because they 
differ in their resistance to extinction. Therefore, rela- 
tive gradients are employed to compare the effects of 
schedule of reinforcement on the slope of the general- 
ization gradient. 

Hearst, Koresko, and Poppen (1964) found that 
relative gradients obtained after differential reinforce- 
ment of low rate (DRL) training were much flatter 
than gradients obtained after VI training. On a DRL 
schedule an interresponse time greater than t sec pro- 
duces reinforcement, while an interresponse time less 
than t sec is extinguished. In a second experiment, 
Hearst, Koresko, and Poppen (1964) trained each 
group of animals on a different value of a VI sched- 
ule. The longer the mean value of a VI schedule, the 
lower the response rate and frequency of reinforce- 
ment. Generalization was measured during extinction 
with the multiple-stimulus method. Figure 1 shows 
that a VI 4-min schedule produces a rather flat relative 
gradient, indicating that more generalization is ob- 
served with long VI schedules than with the short VI 
1-min schedule which is usually employed in general- 
ization experiments. These data demonstrate that the 
slope of a stimulus-generalization gradient can be 
drastically altered by manipulating the temporal dis- 
tribution of food deliveries while holding stimulus 
variables constant. 
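
The DRL contingency described above can be stated compactly in code. The sketch below is a simplified illustration only: the function name and the response times are hypothetical, and it ignores how the first response of a session is timed.

```python
def drl_reinforced(response_times, t):
    """For each response after the first, report whether a DRL t-sec
    schedule would reinforce it (interresponse time greater than t)."""
    outcomes = []
    for previous, current in zip(response_times, response_times[1:]):
        irt = current - previous
        outcomes.append(irt > t)
    return outcomes


# Hypothetical response times in seconds from the start of the session.
times = [0.0, 4.0, 5.0, 12.0, 13.5, 20.0]
result = drl_reinforced(times, t=6.0)
```

With these made-up times the interresponse times are 4, 1, 7, 1.5, and 6.5 sec, so only the third and fifth of them meet a 6-sec requirement.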

[Figure 1. Vertical axis: relative generalization. Horizontal axis: degrees from S+, -90 to +90.] 


Fig. 1. Gradients of relative generalization for five groups of 
pigeons in which each group received training on a different 
value of a VI schedule prior to the generalization test. The S+ 
was a vertical line (0°) for all subjects. In general, the gradient 
becomes flatter as the value of the VI schedule increases. (From 
Hearst, Koresko, & Poppen, 1964. © 1964 by the Society for the 
Experimental Analysis of Behavior, Inc.) 


Thomas and Switalski (1966) compared stimulus 
generalization following variable-ratio (VR) and VI 
training. In order to equate the two schedules for the 
frequency and pattern of reinforcement, pairs of 
pigeons were matched through a yoking procedure. 
The time required by a pigeon on VR training to 
complete each ratio determined the interval at which 
its yoked pigeon on VI training was reinforced. Thus 
when one pigeon’s response was reinforced on the VR 
schedule, the next response of the VI bird was also 
reinforced. The VR schedule generated a higher rate 
of responding than the VI schedule, but the gradient 
for the VR group was slightly flatter than the VI 
gradient. 

What is the explanation for these results? A simple 
explanation is that each response is determined to 
some extent by previous responses (factor A) and to 
some extent by external stimuli (factor B), where 
A + B = 1 (i.e., complete determination). When fac- 
tor A is important, as on DRL and (perhaps) ratio 
schedules, then factor B is correspondingly less so; 
hence the flatter gradients. While this hypothesis is an 
attractive device for integrating data on the effects of 
schedules of reinforcement on stimulus generalization, 
the explanation remains post hoc until a method is 
developed for measuring the extent to which a re- 
sponse is determined by previous responses. 


Microstructure of the Stimulus-generalization 
Gradient 


This subsection provides a review of the summa- 
tion or averaging procedures that are employed on 
the data obtained during a stimulus-generalization 
test. The basic question is whether all responses are 
equivalent. The question has been divided into three 
subsidiary questions which will be considered in turn: 
(1) Does the shape of the gradient depend upon the 
amount of time which has preceded the response? (2) 
Does the shape of the gradient change during the test 
session? (3) Is the gradient an artifact of inappro- 
priate averaging of responses of different topog- 
raphies? 


IRT ANALYSIS OF THE 
STIMULUS-GENERALIZATION GRADIENT 


An interresponse time (IRT) analysis of the rate of 
responding is a useful technique for determining the 
essential characteristics of stimulus control. One of 
D. Blough’s (1969) experiments neatly illustrates the 
contribution of responses within various IRT cate- 
gories to the shape of the stimulus-generalization 
gradient. Pigeons were intermittently reinforced for 
responses to a 582-nm stimulus, and the data for gen- 
eralization were obtained with a maintained pro- 
cedure by randomly presenting a series of adjacent 
wavelengths during extinction. 

Three stimulus-generalization gradients were ob- 
tained by dividing the number of responses into four 
IRT categories or class intervals: .6 to 1.0 sec, 1.0 to 
2.0 sec, 2.0 to 4.0 sec, and greater than 4.0 sec. Few 
responses occurred between 0 and .6 sec because the 
key was darkened during this period to provide stim- 
ulus feedback for each response. The dependent vari- 
able, IRTs/OP, was the conditional probability that a 
response fell within one of the four IRT categories. 
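
A measure of this kind can be computed as sketched below. The sketch assumes the conventional definition of IRTs per opportunity, in which the number of IRTs falling in a class interval is divided by the number of IRTs that were at least as long as the lower bound of that interval; the text above does not spell out the computation, so this definition, the function name, and the data are all assumptions made for illustration.

```python
def irts_per_opportunity(irts, lower_bounds):
    """Conditional probability of a response in each IRT class interval.

    lower_bounds: ascending lower limits of the class intervals; the last
    interval is open-ended. IRTs below the first limit are ignored.
    """
    counts = [0] * len(lower_bounds)
    for irt in irts:
        # Find the highest class interval whose lower limit the IRT reaches.
        for i in range(len(lower_bounds) - 1, -1, -1):
            if irt >= lower_bounds[i]:
                counts[i] += 1
                break
    result = []
    remaining = sum(counts)  # opportunities for the first class interval
    for count in counts:
        result.append(count / remaining if remaining else 0.0)
        remaining -= count   # shorter IRTs give no opportunity for longer ones
    return result


# Hypothetical IRTs (sec) with the four class intervals named in the text.
sample = [0.3, 0.7, 0.8, 0.9, 0.95, 1.2, 1.5, 1.8, 2.5, 3.0, 5.0]
probabilities = irts_per_opportunity(sample, [0.6, 1.0, 2.0, 4.0])
```

Because each class interval is scored only against the responses that could still have fallen in it, the measure for the open-ended final interval is 1.0 whenever any IRT reaches it.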

Figure 2 clearly indicates that the stimulus on the 
key acquires control over the pecking response only 
within an IRT range of 2.0-4.0 sec. The figure shows 
a fairly flat gradient and complete generalization for 
IRTs within the .6-1.0-sec and 1.0-2.0-sec categories. 
In other words, the stimulus on the key does not con- 
trol the rate of responding when an animal responds 
with IRTs less than 2.0 sec. 

Similar data have been obtained with the lever- 
pressing response of rats by Crites, Harris, Rosenquist, 
and Thomas (1967). However, an experiment by 
White (1973) has restricted the generality of this 
phenomenon. Stimulus control over the responding 
of pigeons was acquired by responses in all IRT class 
intervals, including those of less than 2.0 sec. White’s 
procedure differed from Blough’s in a number of sig- 
nificant respects. The generalization test in White’s 
experiment was preceded by differential reinforce- 
ment, while Blough employed nondifferential rein- 
forcement. Additional research is necessary to specify 
the conditions under which stimulus control is ac- 
quired or is not acquired over responses following 
various IRTs. The data base of this research should 
be broadened by employing responses other than 
pecking. For example, Hemmes (1973) has shown that 
pigeons acquire a discrimination between a red and a 
white houselight located above the ceiling of the 
experimental chamber when responses on a foot 
treadle located on the floor of the chamber are rein- 
forced with food. It would be interesting to know if 
stimulus control over the treadle response depends 
upon the duration of the preceding IRT. 


Fig. 2. Maintained gradients obtained following reinforce- 
ment of responding at 582 nm, showing the probability of a 
response as a function of wavelength and interresponse time 
(IRT). The numbers next to each function indicate the IRT 
class intervals in sec. The overall level of the curves varies 
with the number of responses in the class interval. The sig- 
nificant aspect is that the gradient with the class interval be- 
tween 2 and 4 sec is steeper than the flat gradients obtained 
between .6 and 1 sec. (From D. Blough, 1969. © 1969 by the 
Society for the Experimental Analysis of Behavior, Inc.) 


Blough’s experiments, like those discussed in the 
preceding section, imply that the sensitivity of a re- 
sponse to stimulus variation depends upon the 
amount of time which has elapsed since the preceding 
response. 


STEEPENING OF THE 
STIMULUS-GENERALIZATION 
GRADIENT DURING EXTINCTION 


In the multiple-stimulus method, each stimulus is 
presented several times in extinction during the gen- 
eralization test. In general, fewer responses are ob- 
tained with each successive presentation of the same 
stimulus. Several experiments (Friedman & Guttman, 
1965; Thomas & Barker, 1964) have demonstrated a 
steepening of the relative stimulus-generalization 
gradients during extinction. Friedman and Guttman 
analyzed the changes in the generalization gradient 
which occurred during testing by dividing the total 
number of responses during extinction into successive 
quarters. A relative generalization gradient was con- 
structed for each of the four quarters. The gradients 
became steeper as extinction progressed because the 
rate of responding dropped to zero more rapidly for 
the stimuli which were remote from the training stim- 
ulus while responding to the training stimulus de- 
creased less rapidly. Thus a long generalization test 
in which each stimulus is presented many times is 
biased toward a steep gradient, while a short general- 
ization test with few stimulus presentations is biased 
toward a flat gradient. While the magnitude of the 
bias is small, caution is required in interpreting differ- 
ences in the slopes of the gradients from different 
experiments because the differences could be due to 
comparing generalization test sessions of different 
lengths. 
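
The quarter-by-quarter analysis described above can be sketched as follows. The sketch assumes that the test record is available as an ordered list with one stimulus label per response; the function name and the data are hypothetical.

```python
from collections import Counter


def quarter_gradients(responses):
    """Split an ordered response record into successive quarters of the
    total responses and build a relative gradient (percent) for each."""
    n = len(responses)
    bounds = [round(n * k / 4) for k in range(5)]
    gradients = []
    for start, stop in zip(bounds, bounds[1:]):
        chunk = Counter(responses[start:stop])
        size = stop - start
        gradients.append({s: 100.0 * c / size for s, c in chunk.items()})
    return gradients


# Hypothetical record: wavelength (nm) in effect at each successive response.
record = (
    [540] * 2 + [550] * 4 + [560] * 2    # first quarter of responses
    + [540] * 2 + [550] * 5 + [560] * 1  # second quarter
    + [540] * 1 + [550] * 6 + [560] * 1  # third quarter
    + [540] * 1 + [550] * 7              # fourth quarter
)
by_quarter = quarter_gradients(record)
```

In this made-up record the share of responses made to 550 nm rises from 50% in the first quarter to 87.5% in the last, which is the pattern a steepening relative gradient would produce.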

Why does the generalization gradient become 
steeper as extinction progresses? A plausible explana- 
tion is based on an IRT analysis. When the rate of 
responding is high, short IRTs predominate, while 
long IRTs emerge when the rate of responding is low. 
Since the rate of responding slows during extinction, 
IRTs become longer. Blough’s data demonstrated 
that responses preceded by long IRTs may acquire 
more stimulus control than do responses preceded by 
short IRTs. Since long IRTs predominate toward the 
end of a stimulus-generalization test carried out 
during extinction, the gradient should become 
steeper. 


THE STIMULUS-GENERALIZATION 
GRADIENT: FACT OR ARTIFACT? 


Several investigators (Migler, 1964; Migler & Mil- 
lenson, 1969; Ray & Sidman, 1970; Stoddard & Sid- 
man, 1967, 1971) view the stimulus-generalization 
gradient as a continuous function, consisting of vary- 
ing proportions of discrete elements. The proportions 
can vary continuously, but the elements (IRTs or 
whatever) are discrete or “quantal.” Consider the re- 
duced number of responses to an intermediate test 
stimulus. The same number of responses could be 
produced by a constant, intermediate rate of respond- 
ing or by averaging brief periods of responding at 
the previously reinforced rate with long periods con- 
taining few or no responses. These investigators em- 
ployed simultaneous methods for assessing stimulus 
control following discrimination training in which 
responses at more than one location were reinforced. 
In general, the test stimuli controlled the relative fre- 
quency of the two responses that were reinforced dur- 
ing training so that a mixture of the two responses 
was obtained at intermediate test stimuli. Mixing of 
different responses is likely to occur during stimulus 
generalization when two incompatible responses have 
been reinforced during simultaneous discrimination 
training prior to the generalization test, so simultane- 
ous methods are well suited to the measurement of 
competing responses during generalization. 
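
The point that a response total cannot distinguish a constant intermediate rate from a mixture can be shown with a small numerical sketch. The rates, durations, and function names below are hypothetical and serve only to illustrate the argument.

```python
def constant_pattern(rate_per_min, minutes):
    """Evenly spaced response times (sec) at a single steady rate."""
    n = int(rate_per_min * minutes)
    gap = 60.0 / rate_per_min
    return [gap * (i + 1) for i in range(n)]


def mixed_pattern(trained_rate_per_min, minutes, fraction_responding):
    """Responding at the trained rate for part of the period, then none."""
    active_minutes = minutes * fraction_responding
    return constant_pattern(trained_rate_per_min, active_minutes)


def interresponse_times(times):
    """Differences between successive response times."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


steady = constant_pattern(30, 2)    # 30 responses per min for 2 min
mixed = mixed_pattern(60, 2, 0.5)   # 60 per min for 1 of the 2 min
steady_irts = set(interresponse_times(steady))
mixed_irts = set(interresponse_times(mixed))
```

Both hypothetical records contain 60 responses in the 2-min period, but the first consists entirely of 2-sec IRTs and the second entirely of 1-sec IRTs followed by a long pause. Only an analysis below the level of the response total separates the two.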

Collins (1974) obtained IRT distributions of the 
pecking response of pigeons while generalization was 
measured with the multiple-stimulus method during 
extinction. Following single-stimulus training in 
which responding to a 554-nm stimulus was reinforced 
on a VI schedule, the number of responses in the 
longer IRT class intervals (>6 sec) increased system- 
atically with divergence from S+, while the frequency 
of responses with short IRTs decreased. Following 
successive discrimination training between two stim- 
uli, the IRT distribution for an intermediate test 
stimulus was a mixture, in varying proportions, of the 
response patterns conditioned in the presence of S+ 
and S-. 

Using rats as subjects, Weiss (1972b) measured IRT 
distributions in the presence of a light associated with 
a DRL schedule and a tone associated with a VR 
schedule. The absence of the light or tone was asso- 
ciated with extinction. After a low rate of responding 
was established on the DRL schedule and a high rate 
of responding was established on the VR schedule, the 
tone and light were presented simultaneously as a 
compound stimulus during extinction. An IRT anal- 
ysis of responding during the compound stimulus re- 
vealed few patterns of responding during the com- 
pound stimulus that were not present during the 
individual presentations of the light and the tone. 

All of these experiments, which employed a wide 
variety of methods for assessing stimulus control, are 
in agreement that the presentation of an intermediate 
test stimulus following discrimination training be- 
tween two stimuli does not produce a constant, inter- 
mediate rate of responding. A significant component 
of the original behaviors which were conditioned dur- 
ing training remains during generalization testing. 
Therefore, as Weiss (1972b) points out, the stimulus- 
generalization gradient is probably a product of the 
mixing of a small number of response classes. The 
task of the microanalysis of stimulus control is to 
determine and isolate the variables responsible for the 
mixture of responses which result in the stimulus- 
generalization gradient. 

A microanalysis of the generalization gradient in 
terms of interresponse times or competing responses is 
compatible with research whose goal is to determine 
the effects of various variables upon the shape of the 
stimulus-generalization gradient. It is to this body of 
research that attention is now directed. 


INFLUENCE OF DISCRIMINATION 
TRAINING ON THE GENERALIZATION 
GRADIENT 


One of the main problems in the discrimination 
learning of animals is to specify the conditions under 
which a change in a stimulus produces a change in 
the probability with which a response occurs. A gen- 
eral finding (see the reviews by Thomas, 1969, 1970) 
is that nondifferential reinforcement produces a flat- 
ter stimulus-generalization gradient than does differ- 
ential reinforcement. Discrimination training is one 
of the most effective procedures for increasing the 
slope of the stimulus-generalization gradient. 


Two Types of Discrimination Training 


The term discrimination training is very broad, 
since there are many different procedures which share 
the defining characteristic of an S+ which is corre- 
lated with reinforcement and an S- which is cor- 
related with extinction. Switalski, Lyons, and Thomas 
(1966) have developed useful terminology for classify- 
ing the types of discrimination training on the basis 
of the relationship between the discriminative stimuli 
(S+ and S-) and the stimulus dimension on which the 
generalization gradient is obtained. 

The first type is intradimensional training, which 
occurs when S+ and S- are selected from the same 
stimulus dimension and a generalization test is carried 
out within that dimension. A common example of 
intradimensional training is a successive discrimina- 
tion in which responding is reinforced in the presence 
of one wavelength, S+, and extinguished in the 
presence of another wavelength, S-. Other dimen- 
sions frequently used are the frequency of an acoustic 
stimulus and orientation of a line. 

An important characteristic of intradimensional 
training is that it is impossible to vary the psycholog- 
ical distance of a test stimulus from S+ without also 
varying its psychological distance from S-. Therefore, 
intradimensional training is employed when the ex- 
perimenter wants to study the interaction between 
reinforcement at S+ and extinction at S- on respond- 
ing to each stimulus. 

The second type of discrimination training is inter- 
dimensional training, which occurs when S+ is equally 
distant psychologically from each of the stimuli on the 
S- dimension or when S- is equally distant from each 
of the stimuli on the S+ dimension. When two dimen- 
sions are psychologically independent, each stimulus 
from the S+ dimension is equally distant psycholog- 
ically from each of the stimuli on the S- dimension. 
Interdimensional training is employed when the ex- 
perimenter wants to compare responding to stimuli 
similar to S- with responding to stimuli similar to 
S+, under conditions where the two kinds of respond- 
ing are assumed to be independent. Two separate 
generalization gradients for the S+ and S- dimensions 
may be obtained with interdimensional training, 
thereby avoiding the interaction obtained with intra- 
dimensional training. This rationale for interdimen- 
sional training was stated by Jenkins (1965) and is 
described in greater detail in monographs by Hearst, 
Besley, and Farthing (1970) and Hearst (1972). 

Silence, the absence of S+ or S-, is a stimulus at 
the end point of the intensity (loudness) dimension 
which is often employed in interdimensional training. 
For example, suppose that silence is the stimulus that 
is employed as S- and a tone of 1,000 Hz is employed 
as S+. If intensity and frequency are independent 
dimensions, then the frequency of the tone may be 
varied during a stimulus generalization test without 
changing the psychological distance between the test 
values from the S+ dimension and S-. 

The stimuli for the early experiments in inter- 
dimensional discrimination training were a priori as- 
sumed to be psychologically independent dimensions. 
However, psychological independence is an empirical 
concept, and a preliminary experiment is required to 
demonstrate that the stimuli that appear independent to 
the experimenter are also functionally independent, 
as indicated by the behavior of the organism. A test 
for psychological independence is conducted as fol- 
lows. First a response is reinforced in the presence of 
a stimulus from the A-dimension without the presen- 
tation of a stimulus from the B-dimension. Then a 
stimulus-generalization gradient is obtained by pre- 
senting test stimuli from the B-dimension. If a hori- 
zontal gradient is obtained, the stimulus from the A- 
dimension is independent of the B-dimension. The 
converse experiment could be carried out by first 
training with stimulus B and then testing on the A- 
dimension to determine if stimulus B is independent 
of the A-dimension. 

Giurintano, Schadler, and Thomas (1972) condi- 
tioned the pecking response of different groups of 
pigeons in the presence of stimuli that are a priori 
independent of the dimension of line orientation: a 
white light, a green light, and a white dot. Then each 
group was given a stimulus-generalization test in 
which the angle of a white line on a dark background 
was varied. A preexperimental preference for a par- 
ticular orientation was not obtained. Training with 
the white or dim light produced a preference for 
vertical, while training with the dot produced a pref- 
erence for 30°. Only the green stimulus resulted in no 
preferred orientation and therefore was functionally 
orthogonal to the angle of the line. Using a similar 
procedure, Selekman (1973) conditioned pecking in 
the presence of a white key and obtained a nonhori- 
zontal gradient on the wavelength dimension. The 
pigeons demonstrated a preference for short wave- 
lengths between 510 and 560 nm. These data demon- 
strate that an experiment is necessary to determine if 
the two dimensions employed in interdimensional 
training are functionally independent. Complete func- 
tional independence is probably an ideal state, and it 
is unlikely that any two dimensions are completely 
independent. By determining an organism’s prefer- 
ence for certain stimuli, the experimenter may con- 
sider this response bias in selecting the most appro- 
priate stimuli for interdimensional training. Thus, for 
example, the experimenter may deliberately design 
the experiment to counteract the anticipated bias in 
order to make such an outcome more convincing. 


Comparing Nondifferential with 
Differential Reinforcement 


The effects of nondifferential and differential train- 
ing procedures on stimulus generalization are illus- 
trated by a widely cited experiment of Jenkins and 
Harrison (1960). The data from a subsequent experi- 
ment of Jenkins and Harrison (1962) extend these 
results comparing the effects of interdimensional and 
intradimensional training on the slope of the stim- 
ulus-generalization gradient. In their first experiment, 
one group of pigeons was given nondifferential train- 
ing in which a 1,000-Hz tone signaled that a VI sched- 
ule of reinforcement was in effect. No training stim- 
ulus explicitly correlated with extinction was intro- 
duced until the generalization test. A second group 
was given interdimensional training in which S+ was 
a 1,000-Hz tone and S- was silence. The tone and 
silence were randomly presented so that the animals 
learned to respond in the presence of the tone and not 
to respond in the absence of the tone. 

The individual gradients for a representative bird 
from each group which received each type of training 
are shown in Figure 3. During generalization, the test 
stimuli were tones widely separated in frequency from 
the training stimulus, as well as silence. The relative 
gradient following nondifferential training demon- 
strated weak stimulus control, since the gradient was 
relatively flat with a maximum at S+. 

In contrast with nondifferential training, inter- 
dimensional training produced a steeper gradient 
with a clear maximum at 1,000 Hz and over 20% of 
the total responses occurring to the training stimulus 
for each animal. Following interdimensional training, 
the frequency of the tone during the test system- 
atically controlled the rate of pecking. Why does the 
frequency of the tone acquire control over responding 
when the only source of discrimination training is be- 
tween the presence and absence of the tone? Clearly, 
differential reinforcement is more effective than non- 
differential reinforcement in activating stimulus con- 
trol, but the question has no adequate answer. 

In a second experiment, Jenkins and Harrison 
(1962) obtained much steeper generalization gradients 
when several of the birds were given additional intra- 
dimensional training between two closely spaced tones 
rather than between a tone and the absence of the 
tone as in their original experiment. Notice in Figure 
3 that following intradimensional training between an 
S+ of 1,000 Hz and an S- of 950 Hz, the maximum 
did not occur at S+, but occurred instead at 1,050 Hz. 
This phenomenon is called the peak shift and is the 
general result of intradimensional training. In the 
peak shift, the peak or mode of the generalization 
gradient occurs at a test stimulus which is displaced 
from S+ in a direction away from S-. The significance 
of this important phenomenon is discussed later. 


Fig. 3. Representative individual generalization gradients of 
total frequency obtained from pigeons following three training 
conditions. The open circles show a relatively flat gradient 
obtained after nondifferential reinforcement in the presence 
of a 1,000-Hz tone. The closed circles show that a steeper 
gradient was obtained following interdimensional training in 
which S+ was a 1,000-Hz tone and S- was no tone. The open 
triangles show that the steepest gradient was obtained following 
intradimensional training in which S+ was a 1,000-Hz tone and 
S- a 950-Hz tone. (Adapted from Jenkins & Harrison, 1960. © 
1960 by the American Psychological Association, and reprinted 
by permission. And Jenkins & Harrison, 1962. © 1962 by the 
Society for the Experimental Analysis of Behavior, Inc.) 


In an earlier review of the Jenkins and Harrison 
experiment, Terrace (1966c) concluded that “differ- 
ential reinforcement was necessary to establish stim- 
ulus control” (p. 281). However, this conclusion was 
not correct since Jenkins and Harrison obtained weak 
stimulus control following nondifferential reinforce- 
ment. In addition, Thomas and Setzer (1972) showed 
reliable auditory frequency generalization gradients 
in both rats and guinea pigs following nondifferential 
reinforcement. The finding that auditory stimuli ac- 
quire good stimulus control during nondifferential 
reinforcement with food for rats and guinea pigs and 
poor control for pigeons may reflect a difference be- 
tween either the ontogeny or the inherited propen- 
sities of the subjects in these experiments. Therefore, 
the effect of nondifferential reinforcement on stimulus 
control depends upon the stimulus dimension, rein- 
forcer, and species of the subject. (See Mackintosh, 
chapter 16 in this volume.) 

Using pigeons and visual stimuli, Switalski, Lyons, 
and Thomas (1966) and Lyons and Thomas (1967) 
compared the effects of interdimensional nondifferen- 
tial reinforcement and interdimensional differential 
reinforcement on the slope of the stimulus-generaliza- 
tion gradient. In the Lyons and Thomas study, the 
stimuli were a white vertical line on a black back- 
ground and a 555-nm light. When responding to both 
stimuli was equally reinforced, the gradients for wave- 
length were flattened. Following interdimensional 
training between 555 nm (S+) and the vertical line 
(S-), the gradients for wavelength were steepened. 

Interdimensional training does not guarantee stim- 
ulus control over the behavior of the organism by the 
dimension selected by the experimenter, but it is more 
effective than nondifferential reinforcement. When 
compared with interdimensional training, intradimen- 
sional training between stimuli which are closely 
spaced on the stimulus dimension further sharpens 
the stimulus-generalization gradient. 


Effects of Intradimensional Training 


THE POSITIVE PEAK SHIFT 


The classic peak shift experiment was performed 
by Hanson (1959). Four groups of pigeons were given 
intradimensional training on the wavelength dimen- 
sion with the same S+, 550 nm. The groups differed 
only with respect to the S- employed: 555, 560, 570, 
and 590 nm. The experimental groups were trained 
on the discrimination until the response rate during 
S- reached zero. The number of sessions required to 
reach this discrimination criterion increased as the 
difference between S+ and S- decreased. Following 
discrimination training, each animal was given a gen- 
eralization test with stimuli which ranged from 480 to 
620 nm. 

The results for these groups plus a control group 
that received nondifferential reinforcement with S+ 
only are presented in Figure 4. The peak of the 
generalization gradient for the four experimental 
groups did not occur at S+, 550 nm. Rather, the peak 
occurred at 540 nm, a stimulus that the birds had 
never seen until the generalization test. This is the 
positive peak shift, which is usually called simply the 
peak shift. Extrapolation of the gradient obtained 
with S- at 555 nm suggests that the peak would have 
occurred at 535 nm if that test stimulus had been 
presented, and extrapolation of the gradient with S- at 
590 nm suggests a peak shift at 545 nm. Hanson’s data 
suggest that the magnitude of the peak shift depends 
upon the difference between S+ and S-. Most subse- 
quent experiments (see the review by Purtle, 1973, p. 
410) agree that the closer the spacing between S+ and 
S-, the greater the probability of obtaining a peak 
shift. Successful research on the determinants of the 
peak shift requires the inclusion of several test stim- 
uli which are spaced closely to S+. 


Fig. 4. Effects of intradimensional discrimination training on 
the stimulus-generalization gradient for different groups of 
pigeons. For all groups, S+ was 550 nm. Four groups received 
discrimination training with S- at 555, 560, 570, or 590 nm, 
respectively, as indicated by the vertical arrows. A control group 
(solid curve) received nondifferential reinforcement at 550 nm 
and showed a maximum at S+. The groups given discrimina- 
tion training showed a positive peak shift with the maximum 
displaced from S+ to 540 nm. (From Hanson, 1959. © 1959 by 
the American Psychological Association. Reprinted by permis- 
sion.) 


Figure 4 also shows that the experimental groups 
emitted more responses in the vicinity of S+ than did 
the nondifferential control group. This difference 
may simply reflect the fact that the control group 
received only five days of nondifferential reinforce- 
ment followed by the generalization test, while the 
experimental groups received up to 25 additional days 
of discrimination training prior to the generalization 
test. Alternatively, the increased output in the vicinity 
of S+ may reflect the occurrence of behavioral con- 
trast. When behavioral contrast occurs during dis- 
crimination training, the rate of responding to S+ is 
elevated relative to a base line of nondifferential rein- 
forcement prior to the stimulus-generalization test. 
The higher rate to S+ carries over to the generaliza- 
tion test and produces a higher rate of responding to 
the stimuli in the vicinity of S+. When behavioral 
contrast is assessed with a control group, the control 
group should receive the same amount of exposure to 
S+ as the experimental group. 

Discrimination training reduced the number of responses
in the vicinity of S- as compared with the
control group. This produced asymmetrical gradients
with more area on the side of S+ that was away from
S-. The area shift is an index of a gradient in which
more than 50% of the area of the gradient lies on the
side of S+ away from S-. The area shift is based on
the assumption that the theoretical excitatory gradient
around S+ is symmetrical. The area shift is more
sensitive to the effects of intradimensional training
than is the peak shift, since some animals which do
not show a peak shift may still show an area shift
(Terrace, 1966c, p. 328). However, the peak shift is
superior to the area shift as a dependent variable.
Area shifts must be interpreted with caution, since an
asymmetric gradient may reflect chance variability,
stimulus preferences, or lack of excitation in the
vicinity of S-. In Hanson’s experiment the area shift,
the percentage of responses below 550 nm, increased as
the difference between S+ and S- decreased.
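
As a rough illustration of how an area-shift index of this kind can be computed, the following Python sketch takes a gradient as a mapping from test wavelength to number of responses and returns the percentage of responses on the side of S+ away from S-. The function name, the response counts, and the decision to leave responses to S+ itself out of the total (so that a symmetrical gradient scores exactly 50%) are assumptions made for the example, not details of Hanson's procedure.

```python
def area_shift_percent(gradient, s_plus, s_minus):
    """Percentage of responses on the side of S+ that lies away from S-.

    gradient maps each test stimulus value (e.g., wavelength in nm) to the
    number of responses it received. Responses to S+ itself are ignored
    (an assumption of this sketch), so a symmetrical gradient gives 50.0.
    """
    away = 0.0
    toward = 0.0
    for stimulus, responses in gradient.items():
        if stimulus == s_plus:
            continue
        # The "far" side is below S+ when S- lies above S+, and vice versa.
        if s_minus > s_plus:
            on_far_side = stimulus < s_plus
        else:
            on_far_side = stimulus > s_plus
        if on_far_side:
            away += responses
        else:
            toward += responses
    total = away + toward
    if total == 0:
        raise ValueError("no responses to stimuli other than S+")
    return 100.0 * away / total


# Hypothetical response counts, not Hanson's data.
example = {530: 40, 540: 120, 550: 100, 560: 30, 570: 10}
print(area_shift_percent(example, s_plus=550, s_minus=560))  # 80.0
```

A value above 50 would count as an area shift under this convention. As the text cautions, such a value could also arise from chance variability or stimulus preferences.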

One of the most convincing demonstrations of the
peak shift was a study by Thomas and Williams
(1963) which employed an S- located between two S+
stimuli. The S- was 560 nm, which alternated successively
with S+ stimuli of 540 and 580 nm. The postdiscrimination
gradient showed a double peak shift with
the two maximums located at 530 and 590 nm. This
study clearly demonstrates that the peak of the gradient
shifts away from S-.

A comprehensive review of the literature on the 
peak shift by Purtle (1973) indicates that the phe- 
nomenon has been obtained following intradimen- 
sional discrimination training on a variety of dimen- 
sions with several species of organisms, including 
humans. Notwithstanding the generality of the phe- 
nomenon, research on the determinants of the peak 
shift is confronted with the problem of individual 
differences. Published articles are filled with restric- 
tive statements such as “only one of the four subjects 
displayed the peak shift” or “three of the five birds 
produced the peak shift,” etc. When the stimuli are
spaced closely along the stimulus dimension, more
responses may by chance occur to a test stimulus
other than S+, producing an artifactual peak shift.
The optimal parameters for producing the peak shift
should be determined. One problem is that the measurement
of the peak shift during extinction with the
Guttman and Kalish technique may increase the variability
of the data. The use of a maintained procedure
may increase the reliability of the phenomenon.


THE NEGATIVE PEAK SHIFT


A negative peak shift occurs when the minimum of
the generalization gradient occurs at a test stimulus
which is displaced from S- in a direction away from
S+. Careful examination of Figure 4 shows that several
of the experimental groups showed a negative
peak shift. However, reliable measurement of the
negative peak shift was difficult in Hanson’s experiment
due to the low response output for test stimuli
in the vicinity of S-.

Guttman (1965) solved the problem of the zero rate
of responding in the vicinity of S- by nondifferentially
reinforcing responses to a range of spectral
stimuli that were subsequently employed in the generalization
test. This was followed by intradimensional
discrimination training between S+ and S- to a
criterion less strict than Hanson’s. Then a generalization
test was administered during extinction. The
results of this experiment, presented in Figure 5, show
both a positive and a negative peak shift. The data in
Figure 5 represent an average for six animals given
intradimensional discrimination training on the wavelength
dimension. The minimum did not occur at
S-, test stimulus 11, but was displaced beyond S-
away from S+ to test stimulus 14. This is a negative
peak shift. Similarly, the maximum did not occur at S+,
test stimulus 9, but was displaced beyond S+ away from
S- to test stimulus 8.

Fig. 5. A stimulus-generalization gradient illustrating both a
positive and a negative peak shift. Responding to the test
stimuli was nondifferentially reinforced. Then the rate of responding
to S- was reduced by intradimensional discrimination
training. Finally generalization was measured by presenting the
test stimuli during extinction. The maximum did not occur
at S+, test stimulus 9, but was displaced to test stimulus 8, a
positive peak shift. The minimum did not occur at S-, test
stimulus 11, but was displaced to test stimulus 14, a negative
peak shift. (From Guttman, 1965.)

Stevenson (1966) has also obtained a negative peak
shift with a method which involves testing over only
the half of the stimulus dimension centered around
S-. When stimuli in the vicinity of S+ were not presented,
the rate of responding in the vicinity of S- was
much higher than in Hanson’s experiment. The
advantage of Stevenson’s procedure is that a negative
peak shift can be obtained after the birds meet a
strict criterion of no responding to S- without previously
reinforcing responses to any of the test stimuli
except S+. Stevenson’s experiment demonstrates that
the presentation of S+ during the generalization test
is not necessary for the negative peak shift.

Since most of the research has concentrated on the 
positive peak shift, the negative peak shift has been a 
neglected experimental phenomenon. More research 
is needed to determine the correlation between these 
two phenomena of intradimensional discrimination 
training. The simplest assumption is that the determi- 
nants of the positive and negative peak shifts are 
identical. 


ADDITIVE AND SUPPRESSIVE SUMMATION 


In the experiments on the positive and negative
peak shifts, the continuum was defined by a dimension
of a stimulus. Weiss (1972a) defines a stimulus
set in terms of the on and off states of two discriminative
stimuli: a tone (T) and a light (L). With an overbar
marking the off state of a stimulus, the set extends
from the all-off extreme (T̄ + L̄) through the
one-stimulus-on conditions (T + L̄) or (T̄ + L) to the
all-on extreme (T + L). The preceding peak shift experiments
employed two-component multiple schedules
in which each component was associated with the
on state of a stimulus. Weiss argues that consideration
of the rate of responding and conditions of reinforcement
in the off as well as the on states is essential for
a complete understanding of stimulus control.
Therefore, Weiss employs three-component multiple
schedules in which the third component is associated
with the all-off extreme of the ordered stimulus set.

Typically rats are intermittently reinforced for
pressing a lever in these experiments. Consider a
three-component multiple schedule in which responding
is reinforced in the presence of a tone (T + L̄,
tone on and light off) and a light (T̄ + L, light on
and tone off), while responses are extinguished
in the absence of the tone and light (T̄ + L̄). Responding
is maintained at approximately equal rates
in the tone and the light. After discrimination training,
stimulus control is assessed in a compounding
test in which the tone and the light are presented
separately and, in addition, the tone-plus-light compound
stimulus is presented for the first time. The
typical outcome of this type of experiment is additive
summation, in which more responses are obtained to
the compound stimulus than to either of the single
training stimuli alone.

Weiss (1972a, p. 194) explained additive summation
as follows. When the all-off condition T̄ + L̄ is associated
with extinction, the response which is conditioned is
response cessation. When responding is reinforced in the
presence of T + L̄, the behavior which is conditioned
consists of a mixture of the response controlled by the
tone (R_T) and response cessation, which is controlled
by the absence of the light. Similarly, the
(T̄ + L) stimulus controls a mixture of behavior controlled
by the light (R_L) and response cessation
controlled by the absence of the tone. During presentation
of T + L, additive summation occurs because
responding consists of a mixture of responses
controlled by the light (R_L) and responses controlled
by the tone (R_T). The mixture of R_L and R_T produces
a higher rate of responding than the mixture of either
R_T or R_L with response cessation. According to this
formulation the habits conditioned in T̄ + L̄ are influencing
the behavioral control in T + L̄ and L + T̄ through L̄ and
T̄, respectively. Therefore, according to Weiss, the
response rate and reinforcement frequency determined
by the T̄ + L̄ contingency set the conditioning
context within which control is acquired by
T + L̄ and L + T̄.

Now consider a second experiment in which responding
is reinforced in the absence of the light and
tone, and is reinforced at a lower rate or
punished in the presence of the tone or the light. The
typical outcome of this experiment is called suppres- 
sive summation, in which fewer responses are ob- 
tained to the compound stimulus than to either train- 
ing stimulus alone. Weiss also explained suppressive 
summation in terms of a mixture of responses con- 
trolled by each stimulus element. 

Additive summation is analogous to the positive 
peak shift, since in each case the maximum rate of 
responding is controlled by a stimulus removed from
S+ in a direction away from S-. Similarly, suppressive
summation is analogous to the negative peak shift. 
Weiss (1971) noted that these phenomena are func- 
tionally similar in the sense that each has the same 
determinants. Weiss (1972a) summarizes the effects of 
conditioning context as follows: 


Compounding the same schedule-associated
stimuli might lead to additive summation under
one set of circumstances, suppressive summation 
under another or even averaging in a third 
complex schedule context. ... [This means] 
that absolute properties cannot be attributed to 
the schedule-associated behaviors without due re- 
gard to the total schedule context of which they 
are a part. (p. 206, italics in original) 
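
The three outcomes named in the quotation can be told apart by comparing the rate in the compound with the rates in the two single stimuli. The Python sketch below is one simple way to do so. The function name and the rule that any compound rate falling between (or equal to one of) the single-stimulus rates is labeled "averaging" are assumptions of this example, not Weiss's operational definitions.

```python
def classify_compound_test(rate_tone, rate_light, rate_compound):
    """Label the outcome of a compounding test from three response rates."""
    higher = max(rate_tone, rate_light)
    lower = min(rate_tone, rate_light)
    if rate_compound > higher:
        # More responding to the compound than to either element alone.
        return "additive summation"
    if rate_compound < lower:
        # Less responding to the compound than to either element alone.
        return "suppressive summation"
    # Assumed catch-all for intermediate (or tied) compound rates.
    return "averaging"


# Hypothetical rates in responses per minute.
print(classify_compound_test(20, 22, 35))  # additive summation
print(classify_compound_test(20, 22, 8))   # suppressive summation
print(classify_compound_test(10, 30, 18))  # averaging
```

In practice the comparison would rest on a statistical test over subjects or sessions, not on a bare inequality between three numbers.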


Effects of Interdimensional Training 


EXCITATION, INHIBITION, AND 
DIMENSIONAL CONTROL 


Hearst, Besley, and Farthing (1970) define exci- 
tatory and inhibitory stimuli as follows: 


An “excitatory stimulus” is a stimulus ... that 
develops during conditioning . . . the capacity 
to increase response strength above the level 
occurring when that stimulus is absent. An “in- 
hibitory stimulus” is a stimulus that develops 
during conditioning the capacity to decrease re- 
sponse strength below the level occurring when 
that stimulus is absent. (p. 376) 


Consider a hypothetical experiment in which interdimensional
training is employed with S+ correlated
with reinforcement and S- correlated with
extinction. After discrimination training, a test is required
to determine if S+ has acquired the capacity
to increase responding, and a separate test is required
to determine if S- has acquired the capacity to decrease
responding. As the following quotation illustrates,
Hearst et al. distinguish these tests from
procedures for obtaining a stimulus-generalization
gradient:


The term “excitatory dimensional control,” in 
our view, would be applied when new stimulus 
values that lie at progressively greater distances 
along a specific dimension from an excitatory 
stimulus show a graded decremental effect. The 
term “inhibitory dimensional control” would be 
applied when new stimulus values at progres- 
sively greater distances from an inhibitory stimu- 
lus show a graded incremental effect on the 
strength of an operant response. It is important 
to point out that an incremental gradient 
around some stimulus value is necessary but not 
sufficient for defining inhibitory dimensional 
control. The specific stimulus at which responding
is minimal must also be shown to be inhibitory
by some independent test, since it is logically
possible that such a stimulus is relatively
“neutral” and the other values progressively
more excitatory. (p. 377) 


Stimulus-generalization gradients are classified as
excitatory or inhibitory on the basis of whether responding
decreases or increases with increasing distance
from the training stimulus. An excitatory or
decremental stimulus-generalization gradient has a
maximum at or near S+. The number of responses to
a test stimulus decreases with increasing distance
from S+. An inhibitory or incremental stimulus-generalization
gradient has a minimum at or near
S-. The number of responses to a test stimulus increases
with the distance from S-. Excitatory gradients
measure generalization of reinforcement while
inhibitory gradients measure generalization of extinction.
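
The descriptive classification above can be sketched in a few lines of Python. The function below labels a gradient "decremental" if responding falls with distance on every sampled side of the training stimulus, "incremental" if it rises, and "unclassified" otherwise. The function name, the strictness of the monotonicity check, and the tilt values are assumptions for illustration only. As the quotation from Hearst et al. stresses, an incremental label alone does not establish that the training stimulus is inhibitory.

```python
def classify_gradient(gradient, training_stimulus):
    """Label a gradient by how responding changes with distance.

    gradient maps stimulus values to response counts and must include
    the training stimulus itself.
    """
    if training_stimulus not in gradient:
        raise ValueError("gradient must include the training stimulus")

    # Response counts ordered outward from the training stimulus.
    below = [gradient[s] for s in sorted(gradient, reverse=True)
             if s <= training_stimulus]
    above = [gradient[s] for s in sorted(gradient)
             if s >= training_stimulus]
    sides = [side for side in (below, above) if len(side) > 1]
    if not sides:
        return "unclassified"

    def falls(side):
        return (all(a >= b for a, b in zip(side, side[1:]))
                and side[0] > side[-1])

    def rises(side):
        return (all(a <= b for a, b in zip(side, side[1:]))
                and side[0] < side[-1])

    if all(falls(side) for side in sides):
        return "decremental"
    if all(rises(side) for side in sides):
        return "incremental"
    return "unclassified"


# Hypothetical line-tilt gradients (degrees -> responses).
print(classify_gradient({30: 4, 60: 15, 90: 40, 120: 18, 150: 6}, 90))
print(classify_gradient({30: 20, 60: 12, 90: 5, 120: 11, 150: 22}, 90))
```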


The Measurement of Inhibitory Stimulus Control 


This section describes four techniques that have
been developed for the measurement of inhibitory
stimulus control following interdimensional discrimination
training. As Rescorla (1969a, 1969b) and Hearst
(1972) have pointed out, a variety of different experimental
operations are used to define a conditioned
inhibitor. Furthermore, conditioned inhibition
is more difficult to measure than is
conditioned excitation, thus necessitating special control
procedures to avoid confounding conditioned
inhibition with other behavioral processes. Paralleling
the research on conditioned inhibition, a variety
of methods have emerged for the measurement of inhibitory
stimulus control. These include (1) resistance
to extinction, (2) resistance to reinforcement, (3) combined
cue or summation tests, and (4) stimulus reduction.


RESISTANCE TO EXTINCTION 


Three groups of investigators (Honig, Boneau, Bur- 
stein, & Pennypacker, 1963; Jenkins & Harrison, 1962; 
Schwartzbaum & Kellicut, 1962) independently devel- 
oped interdimensional procedures to measure gen- 
eralization of extinction independently of generaliza- 
tion of reinforcement. For example, Honig et al. ran 
two similar studies in which interdimensional discrimination
training was given to one group of
pigeons with S+ as a homogeneous white key and with S- as
a black vertical line bisecting the white key. For
the other group, S+ was a vertical line on the key and
S- was the white key. After differential responding
was well established, stimulus generalization was measured
by presenting the line at various angles with the
Guttman and Kalish technique.

If the effects of extinction generalize, S- should
show the fewest responses and the number of responses
to a test stimulus should increase with increasing
distance from S-. For the groups with the line
correlated with S-, Figure 6 shows that an inhibitory
gradient was obtained around S-. With the line correlated
with S+, an excitatory or decremental
gradient was obtained. This experiment is one of the
clearest demonstrations of parallel generalization for 
extinction and reinforcement. 

The major advantage of interdimensional training
for the measurement of inhibitory dimensional control
is that the stimuli can be varied along the S-
dimension without interacting with S+. Excitatory
gradients around S+ and inhibitory gradients around
S- can be compared with this procedure. Notice in
Figure 6 that more responses were obtained during
the generalization test when the orientation of the
line was employed as the S+ dimension. The major
weakness of the extinction procedure is that when
very few responses are obtained during the generalization
test along the S- dimension, a “floor effect”
makes detection of inhibition extremely difficult.


[Figure 6 graph: mean total responses plotted against degrees of tilt
(30, 60, 90, 120, 150, 180, and no line), with separate line-positive
and line-negative functions for Study 1 and Study 2.]

Fig. 6. The functions with triangles demonstrate excitatory or
decremental gradients of stimulus control and were obtained
following interdimensional training between a vertical line (S+)
and no line (S-). The functions with circles demonstrate inhibitory
or incremental gradients of stimulus control and were
obtained following interdimensional training between a vertical
line (S-) and no line (S+). (From Honig, Boneau, Burstein, &
Pennypacker, 1963.)


RESISTANCE TO REINFORCEMENT


Hearst, Besley, and Farthing (1970) adapted the
method of retardation of the development of a conditioned
response from Rescorla’s (1969a) analysis of
inhibition in classical conditioning. This procedure
substantially increased the reliability of the inhibitory
gradient. Rather than measuring the resistance to
extinction on the S- dimension, as in the extinction
methods, the resistance to reinforcement on the S-
dimension is obtained. First, behavior is extinguished
to S- during interdimensional discrimination training.
Then, during the generalization test, responses to
all of the test stimuli are reinforced on a VI schedule.
The procedure is repeated for several successive daily
sessions. The procedure assumes that the resistance of
a response to the effects of reinforcement, in the
presence of a former S-, provides an index of the inhibitory
properties of that stimulus. For example, an
animal will acquire a response to a novel stimulus
more rapidly than to a stimulus to which responding
has been previously extinguished. The advantage of
the resistance-to-reinforcement method is that it eliminates
the problem of the zero base line which occurs
when no responses are obtained during the generalization
test. A limitation of the procedure is that equal
rates of reinforcement are required in the presence of
each test stimulus. Otherwise, differences in response
rate during testing could be attributed to differential
reinforcement.

Rilling, Caplan, Howard, and Brown (1975) demonstrated
the effectiveness of the resistance-to-reinforcement
procedure in elevating responding to the
S- dimension following errorless learning in which
the rate of responding to S- was essentially zero
throughout discrimination training. The results of this
procedure for 15 successive days of generalization testing
are illustrated in Figure 7. For both birds, the
slope of the inhibitory gradient remained essentially
unchanged during many sessions. While individual
variability was observed in the shape of the gradients
during the early sessions, the inhibitory gradients
showed no tendency to invert and become excitatory
with extended training, as had been reported by Hearst, Besley,
and Farthing (1970). Therefore, the resistance-to-reinforcement
procedure reliably measures inhibitory
stimulus control.


COMBINED CUES OR SUMMATION 


Combined-cue tests are also designed to elevate
response output to the S- dimension during the generalization
test. Many investigators (e.g., Rescorla,
1969a) regard summation as the most direct method
of measuring inhibition, especially when a classical
conditioning paradigm is employed. In this method,
S+ is presented simultaneously with test stimuli from
the S- dimension. The assumption is that if the test
stimulus has inhibitory properties, its presentation
should produce a decrement in responding in the presence of a
stimulus associated with reinforced responding. In a
study by Lyons (1969), using a combined-cues technique,
S+ was a monochromatic light of 550 nm and
S- was a white vertical line on a black background.
During a stimulus-generalization test, the angle of the
line was varied, but each test stimulus from the S-
dimension was superimposed upon S+. Surprisingly,
excitatory generalization gradients were obtained
with the peaks at S-. In a similar experiment using a
combined-cues test in which lines at various angles
were superimposed upon S+, Davis (1971) obtained
evidence for the inhibitory property of the line, since
all such compound stimuli produced lower rates of
responding than S+ alone. However, for some birds
maximum responding occurred when S- was combined
with S+. These data are anomalous, since S-
was an inhibitory stimulus, while a decremental
gradient was obtained around S-.

Fig. 7. Rate of responding to each test stimulus during each of
the 15 days of generalization testing with the resistance-to-reinforcement
procedure for birds 4589 and 5199. (From Rilling,
Caplan, Howard, & Brown, 1975. © 1975 by the Society for the
Experimental Analysis of Behavior, Inc.)

However, Drexler and Terrace’s unpublished data
suggest that Lyons’s and Davis’s failure to obtain inhibitory
stimulus control may have been due to a
high rate of responding to S- prior to the stimulus-generalization
test. These studies indicate that adequate
interpretation of stimulus-generalization gradients
requires data showing the rates of responding to
S+ and S- during the acquisition of the discrimination
prior to the generalization test.

At present, the combined-cues method is the least
satisfactory procedure for measuring inhibitory dimensional
control. Superimposing S+ upon S- may
not produce the desired increase in responding in the
presence of S- during the generalization test, as
Yarczower (1970) noted. The presence of S+ may
overshadow stimuli from the S- dimension and produce
a flat gradient. Furthermore, the measurement
of inhibition is dependent upon the strength of reinforced
responding, so that the detection of inhibition
is more likely when the reinforced behavior is weak
and susceptible to disruption than when the reinforced
behavior is strong and resistant to disruption.
Consistent with this view, Yarczower and Evans (1974)
found that an increase in the amount of training was
accompanied by a reduction in the amount of external
inhibition to a novel stimulus.


STIMULUS-REDUCTION OR 
ADVANCE PROCEDURE 


The fourth technique for measuring inhibitory
stimulus control is stimulus reduction. Several investigators
(Rilling, Askew, Ahlskog, & Kramer, 1969;
Rilling, Kramer, & Richards, 1973; Terrace, 1971)
have demonstrated that pigeons acquire a response
that terminates and thereby reduces the duration of
the stimulus associated with extinction. These data
indicate that a stimulus associated with the absence
of reinforcement may become a conditioned aversive
stimulus. As Honig, Beale, Seraganian, Lander, and
Muir (1972) point out, none of the current definitions
of inhibition deals with the possibility that it may be
defined by a reduction in the duration of S-. Rather,
the definitions concentrate upon a reduced rate of responding
in the presence of a stimulus as the criterion
for inhibition. Unfortunately, this has led most investigators
to neglect duration as a parameter of an
inhibitory stimulus.


SELECTING A METHOD FOR ASSESSING 
INHIBITORY STIMULUS CONTROL 


Rescorla (1969a) and Hearst (1972) argue that the
“most direct” method is the best, and they lean toward
the combined-cues test as the method of choice.
The problem here is that no criterion is employed for
ranking procedures on a scale of “directness.” The
combined-cues test is probably employed more frequently
because of its historical precedent in Pavlov’s
work. An empirical criterion should be employed in
selecting a procedure for measuring inhibitory stimulus
control. The most sensitive procedure is usually
the best choice. Hearst’s (1972) point that experimenters
should employ a variety of methods and
compare the results of different procedures is worthy
of further emphasis. It seems likely that the results of
such research will demonstrate that the different measures
of inhibitory stimulus control are not always
highly correlated.


Spence’s Theory of Discrimination Learning 


ASSUMPTIONS AND 
QUALITATIVE PREDICTIONS


Spence’s (1937) analysis of what is now called intradimensional
learning has become a classic. Although
this theory evolved before the development of
operant techniques, it provides the best explanation
for the effects of intradimensional training on the
postdiscrimination gradient. The theory includes the
following five assumptions:


1. Reinforcement of responding to a stimulus (S+)
produces an excitatory tendency to respond to S+.

2. Excitation generalizes around S+.

3. Extinction of responding to a stimulus (S-) produces
an inhibitory tendency opposite to the tendency
associated with S+.

4. Inhibition generalizes around S-.

5. The predicted response to any test stimulus is
obtained by subtracting the amount of inhibition
to the stimulus from the amount of excitation to
the stimulus.
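
The five assumptions can be made concrete with a small numerical sketch in Python. Everything specific here is an assumption chosen for illustration: the bell-shaped (Gaussian) form of both gradients, the heights and widths, and the placement of S+ at 550 nm and S- at 570 nm. Spence's own hypothetical curves were drawn intuitively and are not reproduced here.

```python
import math


def gaussian(x, center, height, width):
    """Bell-shaped gradient centered on a training stimulus."""
    return height * math.exp(-((x - center) ** 2) / (2 * width ** 2))


def net_tendency(x, s_plus=550, s_minus=570):
    """Assumption 5: excitation at x minus inhibition at x."""
    excitation = gaussian(x, s_plus, height=100.0, width=20.0)   # assumptions 1-2
    inhibition = gaussian(x, s_minus, height=60.0, width=30.0)   # assumptions 3-4
    return excitation - inhibition


# Evaluate the predicted gradient at 1-nm steps (hypothetical test range).
wavelengths = range(480, 641)
net = {w: net_tendency(w) for w in wavelengths}

peak = max(net, key=net.get)      # stimulus with the largest net tendency
trough = min(net, key=net.get)    # stimulus with the smallest net tendency
print("peak at", peak, "nm; minimum at", trough, "nm")
```

With these parameter values the maximum falls below 550 nm and the minimum falls above 570 nm; that is, both extremes are displaced away from the opposite training stimulus. Negative net values in the sketch have no direct behavioral reading, since a response rate cannot fall below zero.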


Spence developed his theory to account for transposition,
which is observed in a simultaneous discrimination
when the subject prefers to S+ a novel stimulus
which is displaced from S+ in a direction away
from S-. Riley (1968) has provided a thorough review
of the theories and research on the transposition problem.
However, the transposition experiments did not
provide a crucial test of Spence’s theory, since the
hypothetical gradients of excitation and inhibition
were never directly measured.

Spence’s theory is easily extended to successive discrimination
experiments. Following intradimensional
training, the postdiscrimination gradient is the resultant
of the interaction between excitation and inhibition.
Therefore, the number of responses emitted
to each test stimulus during a generalization test is
obtained by subtracting the amount of generalized
inhibition to S- from the amount of generalized excitation
to S+.

Spence’s algebraic summation or gradient-interac- 
tion theory produces the following major predictions 
about the shape of the postdiscrimination gradient 
following intradimensional discrimination training: 


1. The maximum or peak of the generalization gradient
occurs at a test stimulus which is displaced from
S+ in a direction away from S-. This is the positive
peak shift.

2. The minimum of the generalization gradient occurs
at a test stimulus which is displaced from S- in a
direction away from S+. This is the negative peak
shift.

3. The magnitude of the peak shift increases as the
difference between S+ and S- is reduced. The
peak shift is not obtained with a large difference
between S+ and S-, since there is no overlap or
interaction between the excitatory and inhibitory
gradients.

4. The peak shift does not occur if the inhibitory
gradient is flat or horizontal. Subtracting a constant
from the excitatory gradient yields a predicted
gradient with the peak at S+.

5. Rate of responding to S+ is reduced by discrimination
training relative to the single-stimulus base
line. Therefore, the number of responses to each
stimulus in the postdiscrimination gradient should
be less than the number of responses in the excitatory
gradient obtained following single-stimulus
training.
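
Two of these predictions are easy to check numerically: the absence of a peak shift when S+ and S- are far apart (the second part of prediction 3), and its absence when the inhibitory gradient is flat (prediction 4). The Python sketch below does so under assumed Gaussian gradients with hypothetical heights and widths. It does not test the graded claim in prediction 3, which depends on the gradient shapes chosen.

```python
import math


def gaussian(x, center, height, width):
    """Bell-shaped gradient centered on a training stimulus."""
    return height * math.exp(-((x - center) ** 2) / (2 * width ** 2))


def peak_location(s_plus, s_minus, flat_inhibition=None):
    """Wavelength (1-nm grid, 450-650 nm) with the largest net tendency.

    If flat_inhibition is given, that constant is subtracted everywhere
    instead of a gradient centered on S-.
    """
    best_x = None
    best_net = -math.inf
    for x in range(450, 651):
        excitation = gaussian(x, s_plus, height=100.0, width=20.0)
        if flat_inhibition is None:
            inhibition = gaussian(x, s_minus, height=60.0, width=30.0)
        else:
            inhibition = flat_inhibition
        net = excitation - inhibition
        if net > best_net:
            best_x, best_net = x, net
    return best_x


print(peak_location(550, 570))                        # S- near S+: peak below 550
print(peak_location(550, 750))                        # S- far from S+: 550
print(peak_location(550, 570, flat_inhibition=30.0))  # flat inhibition: 550
```

In the first call the inhibitory gradient overlaps the excitatory gradient and pulls the maximum away from S-. In the other two calls the subtracted quantity is essentially constant in the neighborhood of S+, so the maximum stays at S+.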


MATHEMATICAL CONSTRAINTS ON
SPENCE’S THEORY 


When Spence (1937) first proposed his theory, he 
was forced to speculate about the theoretical shape of 
the excitatory and inhibitory gradients. This was a 
weakness which he readily acknowledged. His ap- 
proach was essentially intuitive, illustrated with 
graphs of convex hypothetical generalization curves 
which have been reproduced in most learning texts. 
Critics have pointed out that not all excitatory and 
inhibitory functions generate the predictions de- 
scribed above. For example, Hull (1943) noted that 
exponential or concave gradients fail to predict the 
peak shift. In addition, Hebert and Krantz (1965) and 
D. Blough (1969) have shown that linear or tent- 
shaped gradients also fail to predict peak shifts. 

What relationship between the excitatory and inhibitory
gradients is necessary and sufficient for a
quantitative prediction of the positive and negative
peak shifts? To answer this, consider Figure 8, which
presents one of Hanson’s (1959) empirical postdiscrimination
gradients from Figure 4 of this chapter,
which shows both positive and negative peak shifts.
Hypothetical gradients of excitation and inhibition
which sum algebraically to yield Hanson’s data were
devised post hoc. These are also presented in Figure 8.
The hypothetical gradient of excitation has a maximum
at S+, and the hypothetical gradient of inhibition
has a minimum at S-.

Fig. 8. Post hoc prediction of the postdiscrimination gradient
using Spence’s algebraic summation theory. The empirical postdiscrimination
gradient was obtained by Hanson (1959) with S+
at 550 and S- at 570 nm. The empirical gradient is obtained
by subtracting the hypothetical inhibitory gradient from the
hypothetical excitatory gradient. The hypothetical functions
were selected to yield the positive peak shift, in which the peak
of the empirical gradient was displaced from 550 nm to 540
nm, and the negative peak shift, in which the minimum was
displaced from 570 to 580 nm. (Original figure prepared by
Marty Klein.)

On the empirical postdiscrimination gradient, maximal
responding occurred to S1, a stimulus displaced
from S+ in a direction away from S-. Minimal responding
occurred to S2, a stimulus displaced from S-
in a direction away from S+. It can be proven mathematically¹
that in order to obtain a positive peak
shift from S+ to S1, the slope of the inhibitory gradient
between S+ and S1 must be steeper than the
slope of the excitatory gradient between S+ and S1.
Similarly, a necessary and sufficient condition for the
negative peak shift from S- to S2 is that the slope of
the excitatory gradient between S- and S2 be steeper
than the slope of the inhibitory gradient between S-
and S2.

¹ I thank Marty Klein for developing this proof.
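
The slope conditions can be seen from a short algebraic argument. The sketch below is not the proof credited in the footnote, which is not reproduced in the text. It only compares responding at two stimuli. The symbols are assumptions introduced here: $E(s)$ is the amount of excitation at stimulus $s$, $I(s)$ the amount of inhibition, and $N(s)$ the predicted net responding.

```latex
% Net responding under algebraic summation (assumed notation)
N(s) = E(s) - I(s)

% Positive peak shift: more responding at S_1 than at S^+
N(S_1) > N(S^+)
  \iff E(S_1) - I(S_1) > E(S^+) - I(S^+)
  \iff I(S^+) - I(S_1) > E(S^+) - E(S_1)

% Negative peak shift: less responding at S_2 than at S^-
N(S_2) < N(S^-)
  \iff E(S_2) - I(S_2) < E(S^-) - I(S^-)
  \iff E(S^-) - E(S_2) > I(S^-) - I(S_2)
```

In words, responding at S1 exceeds responding at S+ exactly when inhibition falls off more between S+ and S1 than excitation does. Likewise, responding at S2 is lower than at S- exactly when excitation falls off more between S- and S2 than inhibition does. Dividing each difference by the common distance between the two stimuli turns these statements into the comparisons of slopes given in the text.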

The next problem is the selection of a family of
curves which best fit the excitatory and inhibitory
gradients obtained after interdimensional training.
Since theorists (e.g., Rescorla, 1969a; Spence, 1936,
1937) define excitation and inhibition as opposite behavioral
processes, excitatory and inhibitory gradients
from the same family are most appropriate. Gaussian
discriminal distributions, of which the bell-shaped
normal distribution curve is the most familiar
example, are the best mathematical model for describing
the shape of the excitatory and inhibitory gradients
obtained after interdimensional training. (See
Blough, 1969, and Nunnally, 1967, for further discussion
of these distributions.) Blough (1967, 1969) obtained
bell-shaped gradients by presenting several test
stimuli which were spaced closely to the training
stimulus. Sharp peaks in published gradients may just
reflect a lack of data points in the vicinity of S+.

In order to predict the positive and negative peak 
shifts simultaneously, and still have some positive 
values in the postdiscrimination gradient, cither the 
entire excitatory gradient must be further from the 
abcissa than the inhibitory gradient or the inhibitory 
gradient must be a flatter bell than the excitatory 
gradient. Empirical evidence (Hearst. 1968, 1969b: 
Honig et al., 1963; Jenkins & Harrison, 1962) suggests 
that inhibitory gradients are indeed flatter than 
excitatory gradients. Jenkins (1965, pp. 58-59) implies 
that flatter inhibitory gradients must be the case when 
stimulus control along the S- dimension is measured 
by the number of responses (such as key pecking) 
which are reinforced during S+. There are two sub- 
classes of responses, other than key pecking, which 
must be considered: (1) incompatible responses, such 
as turning away from the key, which are presumably 
conditioned during S-, and (2) all other responses. In 
plotting the inhibitory gradient, only a decrease in the 
subclass of incompatible responses which appears as 
an increase in the response that is reinforced in S+ 
contributes to the slope of the inhibitory gradient. An 
increase in all other responses lowers the overall level 
of responding and flattens the inhibitory gradient. 
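
The gradient-interaction argument above can be illustrated numerically. What follows is only a minimal sketch, not a fit to any data set. It assumes Gaussian excitatory and inhibitory gradients, and every number in it (the two wavelengths, the heights, and the spreads) is hypothetical, chosen only so that the inhibitory bell is lower and flatter than the excitatory bell. The function names are likewise invented for the illustration.

```python
import math

# Hypothetical values chosen for illustration only; none come from the text.
S_PLUS = 550.0    # wavelength of S+ (nm)
S_MINUS = 560.0   # wavelength of S- (nm)


def gaussian(x, center, spread, height):
    """Bell-shaped gradient with its peak of `height` at `center`."""
    return height * math.exp(-((x - center) ** 2) / (2.0 * spread ** 2))


def net_gradient(x, exc_spread=15.0, inh_spread=25.0,
                 exc_height=1.0, inh_height=0.6):
    """Excitation around S+ minus inhibition around S- at test value x."""
    excitation = gaussian(x, S_PLUS, exc_spread, exc_height)
    inhibition = gaussian(x, S_MINUS, inh_spread, inh_height)
    return excitation - inhibition


def peak_and_trough(test_values):
    """Return the test values with the largest and smallest net strength."""
    peak = max(test_values, key=net_gradient)
    trough = min(test_values, key=net_gradient)
    return peak, trough


test_values = [float(nm) for nm in range(480, 641)]  # 480..640 nm, 1-nm steps
peak, trough = peak_and_trough(test_values)
print("peak of net gradient:", peak, "nm")
print("trough of net gradient:", trough, "nm")
```

With these made-up parameters the maximum of the net gradient falls on the side of S+ away from S-, and the minimum falls on the side of S- away from S+, which is the pattern described as positive and negative peak shift. Other parameter choices need not give both shifts.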


QUANTITATIVE TESTS OF 
SPENCE’S THEORY 


Appetitive base lines. The hypothetical nature of 
the excitatory and inhibitory gradients is a major 
weakness of Spence’s theory. The development of 
interdimensional training techniques has removed this 
ambiguity, since the shape of the excitatory and in- 
hibitory gradients can be empirically determined. The 
methodology for the prediction of the peak shift from 
gradients of excitation and inhibition has been devel- 
oped by Hearst (1968, 1969b). 

The basic design involves three groups which are 
given successive discrimination training. The pre- 
excitation group is given interdimensional training 
with an S+ selected from the dimension of generaliza- 
tion and an orthogonal S-. Discrimination training is 
followed by a generalization test which produces an 
empirical excitatory gradient. The preinhibition 
group is given interdimensional training with an 
S- selected from the dimension of generalization and an 
orthogonal S+. The generalization test produces an 
empirical inhibitory gradient. The third group is 
given intradimensional training with the S+ of the 
preexcitation group and the S- of the preinhibition 
group. A generalization gradient is also obtained after 
intradimensional training. 

Hearst’s analysis does not assume that the gradient 
obtained after intradimensional training has the same 
form as those obtained after interdimensional train- 
ing. In fact, intradimensional gradients are often 
asymmetric, while interdimensional gradients are 
often symmetrical. The analysis simply predicts the 
postdiscrimination gradient for the intradimensional 
group by subtracting the inhibitory gradient from the 
excitatory gradient. 

One problem is that the absolute gradients vary 
greatly in the mean total number of responses to the 
test stimuli due to the occurrence of behavioral contrast 
during the acquisition of the discrimination. When 
the groups were equated in Hearst’s study (1968) by 
converting the absolute gradients to relative gradients, 
subtracting the relative inhibitory gradient from the 
relative excitatory gradient yielded a predicted 
postdiscrimination gradient that fit the obtained 
gradient reasonably well. Unfortunately, the 
analysis was weakened by Hearst's failure to obtain 
a peak shift for the group that received intradimen- 
sional training. 
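
The subtraction procedure can be sketched in a few lines. This is an illustration under stated assumptions rather than Hearst's actual computation: the response counts are invented, a "relative gradient" is taken here to mean each count expressed as a percentage of the group's total test responses, and the amount of inhibition at each test value is assumed to be the shortfall from the highest point of the relative gradient around S-. The text does not spell out that last conversion, so the helper `inhibition_from_relative` is purely hypothetical.

```python
def to_relative(counts):
    """Express each response count as a percentage of the group's total."""
    total = float(sum(counts))
    return [100.0 * c / total for c in counts]


def inhibition_from_relative(rel_gradient):
    """ASSUMPTION: inhibition = shortfall from the highest point of the S- gradient."""
    ceiling = max(rel_gradient)
    return [ceiling - r for r in rel_gradient]


def predicted_pdg(rel_excitatory, rel_inhibition):
    """Subtract inhibition from excitation, test stimulus by test stimulus."""
    return [e - i for e, i in zip(rel_excitatory, rel_inhibition)]


# Hypothetical response totals for seven test stimuli (not data from any study).
# S+ is test stimulus 3 and S- is test stimulus 4 (0-based positions).
excitatory_counts = [10, 20, 40, 80, 40, 20, 10]   # preexcitation group
inhibitory_counts = [60, 55, 45, 30, 10, 30, 50]   # preinhibition group

rel_exc = to_relative(excitatory_counts)
rel_inh = to_relative(inhibitory_counts)
prediction = predicted_pdg(rel_exc, inhibition_from_relative(rel_inh))

for position, value in enumerate(prediction):
    print(position, round(value, 1))
```

Converting to relative values first removes the difference in overall response output between the groups, so the subtraction compares the shapes of the two gradients rather than their absolute heights.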

Hearst (1968, 1969b) employed the dimension of 
the angle of the line in his experiments. Using the 
design developed by Hearst, Marsh (1972) employed 
the wavelength dimension to see if the subtraction of 
an empirical inhibitory gradient from an excitatory 
gradient produced a predicted gradient including the 
peak shift which corresponded to the gradient ob- 
tained after intradimensional training. Marsh’s pre- 
dicted postdiscrimination gradient displayed a peak 
shift and showed a rough correspondence to the actual 
postdiscrimination gradient. Thus the data derived 
from an appetitive base line provide some support for 
the gradient-interaction theory proposed by Spence. 


Aversive base lines. In contrast with the extensive 
literature on the influence of environmental stimuli 
on behavior maintained by schedules of positive rein- 
forcement, very little is known about the influence of 
environmental stimuli on behavior maintained by 
schedules of negative reinforcement, although Sidman 
(1966, p. 494) demonstrated stimulus control of free 
operant avoidance behavior. The earlier experiments 
in the stimulus control of behavior maintained by 
aversive stimuli were analyzed by Hearst (1969a) in a 
pioneering paper which concluded that similar laws 
of generalization applied to positive and negative 
reinforcement. Hearst found that the parameters of 
the base line rather than the type of behavioral base 
line determined the shape of the stimulus-generaliza- 
tion gradient. 

Basing their experiment upon Hearst’s design, 
Klein and Rilling (1972) investigated the prediction 
of gradients following intradimensional training with 
an aversive base line. Pigeons were trained to press a 
treadle on a shock-postponement schedule in which 
brief 4-mA shocks followed one another at 5-sec inter- 
vals (S-S interval) unless the treadle was pressed. 
After a treadle press, the next shock occurred after 20 
sec (R-S interval). After avoidance responding stabil- 
ized, auditory discriminative stimuli were introduced. 
The positive stimulus (S+) was associated with the 
avoidance schedule, while the negative stimulus (S-) 
was associated with extinction of avoidance without 
shocks. For the excitatory group, S+ was a 1,000-Hz 
tone and S- was noise; for the inhibitory group, S+ 
was noise and S- was a 1,500-Hz tone; and for the 
intradimensional group, S+ was a 1,000-Hz tone and 
S- was a 1,500-Hz tone. After reaching a criterion for 
the acquisition of the discrimination, each group was 
given an identical stimulus-generalization test along 
the frequency dimension. 
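
The shock-postponement contingency described above (a 5-sec S-S interval and a 20-sec R-S interval) can be made concrete with a short sketch. The function name, its arguments, and the decision to schedule the first shock one S-S interval after the session starts are assumptions made for this illustration; they are not details taken from the experiment.

```python
def shock_times(response_times, session_end, ss_interval=5.0, rs_interval=20.0):
    """Return the times at which shocks are delivered on a postponement schedule.

    Shocks follow one another at `ss_interval` unless a response occurs;
    each response postpones the next shock to `rs_interval` after it.
    ASSUMPTION: the first shock is due `ss_interval` after time zero.
    """
    shocks = []
    pending = sorted(response_times)
    next_shock = ss_interval
    i = 0
    while next_shock <= session_end:
        if i < len(pending) and pending[i] < next_shock:
            # A response before the scheduled shock postpones it.
            next_shock = pending[i] + rs_interval
            i += 1
        else:
            # No response intervened, so the shock is delivered.
            shocks.append(next_shock)
            next_shock += ss_interval
    return shocks


# No responding: shocks arrive every 5 sec.
print(shock_times([], 20.0))
# One treadle press at 3 sec postpones the first shock to 23 sec.
print(shock_times([3.0], 40.0))
# Pressing every 10 sec avoids every shock in a 40-sec period.
print(shock_times([4.0, 14.0, 24.0, 34.0], 40.0))
```

During the extinction component no shocks are scheduled at all, so this function would simply not be applied there.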

Two types of generalization tests were employed: 
resistance to extinction, in which there were no sched- 
uled shocks; and resistance to (negative) reinforce- 
ment, in which one avoidable shock occurred 5 sec 
after the beginning of each test tone presentation if 
the animal failed to respond during the first 5 sec. 
The resistance-to-negative-reinforcement procedure 
was developed as an analog to the procedure devel- 
oped by Hearst et al., for positive reinforcement, in 
order to elevate the rate of responding along the S- 
dimension. This technique is also similar to Hoff- 
man’s (1966, pp. 516-517) use of noncontingent shocks 
between stimulus presentations which increased stim- 
ulus control in a conditioned suppression paradigm. 

The relative gradients for the three groups are pre- 
sented in Figure 9. The left panel shows the gradients 
obtained with shock, and the right panel shows the 
gradients obtained without shock. For Group I, the 
excitatory group, in which S+ was 1,000 Hz and S- 
was noise, the peak of the gradients occurred at S+. 
For Group II, the inhibitory group, in which S+ was 
noise and S- was 1,500 Hz, an inhibitory gradient 
with a minimum at S- was obtained. For Group III, 
the intradimensional condition, in which S+ was 
1,000 Hz and S- was 1,500 Hz, an asymmetric gradient 
with a maximum at S+ and a minimum at S- was 
obtained. Only one of the four birds in the intra- 
dimensional group showed a peak shift. When pecking 
was reinforced with food, Jenkins and Harrison (1962) 
obtained a peak shift with an S+ of 1,000 Hz and an 
S- of 950 Hz. Possibly the difference between S+ and 
S- in the Klein and Rilling experiment was too large 
to obtain a peak shift. 

Fig. 9. The upper panel presents the relative excitatory gradi- 
ents obtained for Group I. The center panel presents the relative 
inhibitory gradients obtained for Group II. The lower panel 
presents the relative postdiscrimination gradients. The gradients 
on the left are the results of tests conducted with avoidable 
shocks. In these gradients, the solid-lined gradients (WSE) 
include shock-elicited responses, while the broken-lined gradi- 
ents (W/OSE) exclude shock-elicited responses. The gradients 
on the right are from tests conducted without scheduled shocks. 
Tones are spaced at approximately equal intervals along a log 
scale. Note that the ordinate scale for the Group I no-shock 
gradient (upper right) differs from the other ordinate scales. 
(From Klein and Rilling, 1974. © 1974 by the Society for the 
Experimental Analysis of Behavior, Inc.) 

The left panel of Figure 10 illustrates how the 
inhibitory gradient was subtracted from the excitatory 
gradient to yield a derived postdiscrimination gra- 
dient (PDG). The right panel of Figure 10 shows that 
reasonable agreement was obtained between the pre- 
dicted PDG 1 and the empirical relative gradient ob- 
tained from Group III, since four of the seven points 
matched almost exactly. The prediction is improved 
further by converting to proportions in PDG 2. 
Figure 9 shows that the relative excitatory gradient 
was much steeper for the tests without shock than for 
the tests with shock. Actually, very few responses to 
each test stimulus were obtained for the excitatory 
gradients without shock. The predictions in Figure 10 
would have been much worse had the data for the 
tests without shock been employed, but the tests with 
shock probably provide a more accurate picture of the 
generalization process. 

Fig. 10. (Left) Relative gradients of excitation (closed triangles) 
and inhibition (squares). The excitatory gradient is the WSE 
gradient of Figure 9 (upper left). The inhibitory gradient is 
calculated from the WSE gradient of Figure 9 (center left). The 
numbers in the square brackets alongside the vertical lines 
between the gradients represent the algebraic sum of the two 
points that the particular line connects. The numbers in paren- 
theses along the same vertical lines were obtained by trans- 
forming the numbers in square brackets to a scale of 100. 
(Right) Empirical (closed-circles) postintradimensional discrimi- 
nation gradient (PDG) of relative generalization from Figure 9 
(WSE), compared with PDGs derived from the calculations on 
the left. The values on derived PDG 1 are from the square 
brackets on the left, while the values on derived PDG 2 are 
from the parentheses on the left. (From Klein & Rilling, 1974. © 
1974 by the Society for the Experimental Analysis of Behavior, 
Inc.) 

In general, the results of the experiments of Hearst, 
Marsh, and Klein and Rilling support Spence’s gra- 
dient-interaction theory and suggest that the deter- 
minants of generalization on appetitive and aversive 
base lines are similar. 


Dynamic Models of Stimulus Control 


While Spence’s theory integrates much of the data 
on stimulus control, it is inadequate in accounting for 
a growing number of phenomena. In some cases, re- 
sponding is found to be enhanced in the presence of 
a stimulus near an S-, and responding is diminished 
in the presence of a stimulus near an S+. Several 
dynamic models have been developed (D. Blough, 
1975; Rescorla & Wagner, 1972; Wagner & Rescorla, 
1972) which incorporate such phenomena that are not 
easily handled by Spence’s theory. The basic assump- 
tions of their model are stated by Rescorla and Wag- 
ner (1972) as follows: 


The effect of a reinforcement or nonreinforce- 
ment in changing the associative strength of a 
stimulus depends upon the existing associative 
strength, not only of that stimulus, but also of 
other stimuli concurrently present. It appears 
that the changes in associative strength of a 
stimulus as a result of a trial can be well-pre- 
dicted from the composite strength resulting 
from all stimuli present on that trial. If this 
composite strength is low, the ability of a rein- 
forcement to produce increments in the strength 
of component stimuli will be high; if the com- 
posite strength is high, reinforcement will be 
relatively less effective. Similar generalizations 
appear to govern the effectiveness of a nonrein- 
forced stimulus presentation. If the composite 
associative strength of a stimulus compound is 
high, then the degree to which a nonreinforced 
presentation will produce decrements in the 
associative strength of the components will be 
large; if the composite strength is low, the effect 
of a nonreinforcement will be reduced. (p. 73) 
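
The verbal statement quoted above corresponds to the familiar Rescorla-Wagner learning rule, in which the change in the strength of each stimulus present on a trial is proportional to the difference between the asymptote supported by the trial outcome and the composite strength of all stimuli present. The sketch below is a minimal illustration of that rule. The single `rate` parameter stands in for the product of the model's salience and learning-rate parameters, and the stimulus names, parameter values, and trial counts are hypothetical.

```python
def rw_trial(strengths, present, reinforced, rate=0.2, asymptote=1.0):
    """Apply one Rescorla-Wagner trial to the stimuli in `present`, in place."""
    target = asymptote if reinforced else 0.0
    composite = sum(strengths[s] for s in present)
    error = target - composite      # shared by every stimulus on the trial
    for s in present:
        strengths[s] += rate * error


def train(strengths, present, reinforced, trials):
    """Repeat the same kind of trial `trials` times."""
    for _ in range(trials):
        rw_trial(strengths, present, reinforced)
    return strengths


# Pretrained group: A alone is reinforced first, then the AB compound.
pretrained = {"A": 0.0, "B": 0.0}
train(pretrained, ["A"], True, 50)
train(pretrained, ["A", "B"], True, 50)

# Comparison group: the AB compound is reinforced from the start.
comparison = {"A": 0.0, "B": 0.0}
train(comparison, ["A", "B"], True, 50)

print("B after A was pretrained:", round(pretrained["B"], 3))
print("B without pretraining:  ", round(comparison["B"], 3))
```

When A already carries most of the composite strength, reinforcement of the compound adds almost nothing to B, whereas B gains substantial strength when the compound starts from zero. The same function handles nonreinforcement: calling `rw_trial` with `reinforced=False` on a compound whose composite strength is high produces a large decrement in every component.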


Two experiments by D. Blough (1975) illustrate 
the utility of a dynamic model of stimulus control. In 
these experiments the rate of responding to stimuli 
along a continuum is observed. Associative strength is 
manipulated by associating a moderate rate of rein- 
forcement with each stimulus on the continuum 
which produces an above-zero base line. A decre- 
mental gradient is obtained by increasing the fre- 
quency of reinforcement in the presence of one stim- 
ulus, while an incremental gradient is obtained by 
decreasing the frequency of reinforcement. 

Fig. 11. (Left) A family of excitatory gradients for one bird classified by breaking down 
the 20-sec interval into four 5-sec periods. The gradients become flatter as the end of 
the interval approaches. (Right) A family of inhibitory gradients for one bird classified 
within the FI 20-sec base line. Note the “shoulders” to the left and right of 597 for the 
0-5-sec functions and the apparent ceiling effect in curves collected during the last two 
quarters of the trial interval. The excitatory gradients were obtained by increasing the 
probability of reinforcement in the presence of 597 relative to the other test stimuli, and 
the inhibitory gradients were obtained by decreasing the probability of reinforcement at 
597 nm. (From Blough, 1975. © 1975 by the American Psychological Association. Re- 
printed by permission.) 

The base line was an FI 20-sec schedule and re- 
sponses were recorded within 5-sec intervals: 0-5, 5- 
10, 10-15, and 15-20 sec from the onset of the fixed interval. 
The probability of food at the end of the interval was 
.1. The moderate rate of reinforcement for all of the 
stimuli assured an above-zero base line prior to gen- 
eralization testing. The most important aspect of this 
procedure is that an excitatory gradient was obtained 
by increasing the probability of reinforcement during 
597 nm, S+; and following a return to the base line, 
an inhibitory gradient was obtained by decreasing the 
probability of reinforcement to 597 nm, now an S-. 
These results are presented in Figure 11. Responses 
are shown separately for each of the four 5-sec periods 
within the 20-sec trial interval. The FI schedule was 
selected to avoid a “ceiling effect” in which the rate 
of responding is so high that it is not influenced by 
the value of the test stimulus. Generalization was 
measured each day with a maintained procedure in 
which several test trials were included within each 
session. 

Fig. 12. Mean relative rate of responding for three birds. Data 
were obtained during discrimination training in which rein- 
forcement was four times as frequent in the conditioned rein- 
forcer that ended long-wavelength trials as it was in the con- 
ditioned reinforcer that ended short-wavelength trials. Note the 
trough and peak to the left and right of the dividing line 
between low and high reinforcement. (From Blough, 1975. © 
1975 by the American Psychological Association. Reprinted by 
permission.) 

In Figure 12 the probability of reinforcement was 
four times higher on long-wavelength trials than on 
short-wavelength trials. The rate of responding immediately to the 
right of the boundary between reinforcement prob- 
abilities is accentuated relative to the other stimuli 
receiving the same probability of reinforcement, while 
the rate of responding to the left of the boundary is 
attenuated relative to the other stimuli receiving the 
same probability of reinforcement. Thus the boun- 
dary between reinforcement conditions for multiple 
stimuli on a continuum is an important determinant 
of inhibitory stimulus control. These data confirm 
experiments of Catania and Gill (1964) and Farthing 
(1974). Blough has developed a dynamic mathematical 
model, similar to that of Rescorla and Wagner (1972), 
that provides an excellent fit to these data. This 
phenomenon is also analogous to the Mach bands ob- 
tained in visual perception which are produced by 
differences in luminance between adjacent regions of 
the stimulus. 

In Blough’s experiments, each stimulus presenta- 
tion was followed by an interval of 5-12 sec during 
which any responses on the dark key were ex- 
tinguished. Weiss’s research on stimulus compounding 
suggests that what is learned when the discriminative 
stimuli are “off” is an important determinant of stim- 
ulus control. Therefore, perhaps Blough’s results 
would have been different without the dark key. 

One of the controversial issues in stimulus control, 
as Hearst et al. (1970) point out, has been the search 
for a neutral zone between excitation and inhibition. 
They point out that an incremental gradient around 
S- does not indicate whether that stimulus is neutral 
or inhibitory. It could be argued that a concept of 
inhibition is not necessary to explain Blough’s data, 
since the behavior could also be explained in terms of 
reduced excitation. It would be useful to extend 
Blough’s procedure to interdimensional training to 
determine if the combination of S- with S+ produces 
a decrement in responding. Such a procedure would 
establish that S- is an inhibitory stimulus. 

Blough’s data suggest that inhibition occurs rela- 
tive to a base line probability of reinforcement in the 
presence of the background stimuli. There is, there- 
fore, no point of absolute neutrality, only increments 
or decrements from the base line of nondifferential 
reinforcement. Stimuli associated with reduced rates 
of reinforcement demonstrate some of the same in- 
hibitory phenomena as do stimuli associated with 
extinction. An important empirical question is the 
extent to which similar functional relationships are 
obtained between these two conditions. 


DETERMINANTS OF THE PEAK SHIFT 
AND INHIBITORY STIMULUS CONTROL 


Historically, the peak shift was investigated before 
interdimensional training led to the discovery of the 
inhibitory stimulus-generalization gradient. Conse- 
quently, much more is known about the determinants 
of the former than of the latter. Hearst’s three-group 
design provides the most promising framework for a 
quantitative analysis of the relationship between inter- 
dimensional and intradimensional training. Unfor- 
tunately, experiments which compare the two types 
of discrimination training under equivalent condi- 
tions are rare. However, the available data (Hearst, 
1969b; Klein & Rilling, 1974; Marsh, 1972) suggest 
that the determinants of the peak shift and the inhib- 
itory gradient are identical. 

Since Klein and Rilling obtained inhibitory gra- 
dients on an avoidance base line, it appears that the 
peak shift and inhibitory control do not depend on 
whether responding is maintained by an appetitive or 
an aversive base line. 

In the sections which follow, a unified theory of the 
peak shift and inhibitory dimensional control will be 
presented. Consideration will be given to those vari- 
ables which fail to produce the peak shift and inhib- 
itory dimensional control. This analysis sheds some 
light on the necessary and sufficient conditions for 
these effects. 


Amount of Training 


Most theories of discrimination learning, including 
those of Spence (1936) and Hull (1943), predict that 
the excitatory and inhibitory gradients become steeper 
as a function of the amount of discrimination train- 
ing. If the determinants of the peak shift and inhib- 
itory stimulus control are the same, the probability of 
obtaining the peak shift should also increase as a func- 
tion of the amount of training. Empirical investiga- 
tion of the influence of the amount of training, de- 
fined by the number of sessions, indicates that this 
variable is indeed one of the most important deter- 
minants of the peak shift and inhibitory stimulus 
control. It is not clear at present, though, which 
aspects of this variable (e.g., number of reinforce- 
ments, duration of exposure to the discriminative 
stimuli, number of alternations between S+ and S-, 
etc.) are critical in determining the slope of the gen- 
eralization gradient. 


INTRADIMENSIONAL TRAINING 


Thomas (1962) observed the acquisition of the area and 
peak shift by administering a short generalization test 
to each subject following every even-numbered session 
of discrimination training. After only 90 min of ex- 
posure to S-, a reliable shift in the area of the gra- 
dient was observed, sometimes even before the rates of 
responding to S+ and S- began to separate. The mag- 
nitude of the area shift increased as a negatively 
accelerated function of the amount of training. 

Once the effects of short amounts of training were 
determined, investigators turned to the effects on the 
peak shift of training extended over many sessions. 
Terrace (1966a) administered generalization tests fol- 
lowing the 15th, 30th, 45th, and 60th sessions of dis- 
crimination training between two wavelength stimuli: 
580 nm (S+) and 506 nm (S-). The peak of the 
first gradient occurred at 600 nm, a peak shift of 20 
nm. During the next two generalization tests, the 
magnitude of the peak shift decreased to 10 nm, and 
by the last generalization test, the peak shift had dis- 
appeared and the peak reverted back to S+. Positive 
behavioral contrast also decreased with extended 
training. In positive behavioral contrast, the response 
rate in the constant component increases when the 
response rate in the variable component is decreased 
by extinction. These data led Terrace to the conclu- 
sion that determinants of behavioral contrast and of 
the peak shift are identical. 

Data obtained in subsequent experiments disprove 
Terrace’s interpretation. Dukhayyil and Lyons (1973) 
determined the effects of 105 days of intradimensional 
training on behavioral contrast and the peak shift. As 
in Terrace’s experiment, generalization tests were ad- 
ministered at regular intervals. While no bird showed 
a peak shift on each test, a majority of the birds 
showed a peak shift after 105 days of training. Sub- 
stantial fluctuations in the rate of responding to S+ 
were obtained. Therefore on a day-by-day basis, the 
occurrence of the peak shift is not correlated with the 
existence of contrast (i.e., with response rate in S+). 
Both behavioral contrast and the peak shift are ob- 
tained after extended discrimination training. 


INTERDIMENSIONAL TRAINING 


In an independent groups design, Hearst and 
Koresko (1968) administered a generalization test fol- 
lowing different amounts of training in which re- 
sponding in the presence of S+ was nondifferentially 
reinforced. The shallow excitatory gradient which was 
obtained on the second day of training became pro- 
gressively steeper through 14 days of training. 

In the Hearst and Koresko study, the acquisition of 
the response was confounded with the acquisition of 
dimensional stimulus control, since both processes oc- 
curred simultaneously. These processes were isolated 
by Schadler and Thomas (1972). In order to firmly 
establish the response, pecking the key was first non- 
differentially reinforced during 10 daily 30-min ses- 
sions in the presence of a white key that was ortho- 
gonal to angularity, the dimension of stimulus control. 
Then generalization gradients were obtained after 0, 
5, 10, or 20 min of nondifferential training in the 
presence of a single white vertical line on a black 
background. Dimensional control by angularity was 
acquired rapidly and approached asymptote after 20 
min of nondifferential training. The Schadler and 
Thomas study demonstrated that the acquisition of 
response strength and the acquisition of stimulus con- 
trol over responding are distinct, separable behavioral 
processes. Once the response is acquired, control by a 
stimulus associated with reinforcement may develop 
in a matter of minutes. 

Farthing and Hearst (1968) determined how the 
amount of interdimensional training affected general- 
ization of extinction. After seven sessions of nondiffer- 
ential reinforcement which established the pecking 
response, five groups of pigeons received from 1 to 16 
days of discrimination training. The S+ was a plain 
white field and S- was a white field bisected by a 
black vertical line. Following discrimination training, 
each animal was given a generalization test in extinc- 
tion, during which the angle of the line was varied. 
Figure 13 shows the relative gradient obtained from 
each bird. At least one steep inhibitory gradient was 
obtained at each condition of days of training. The 
basic effect of the independent variable was to in- 
crease over days the probability of obtaining an in- 
hibitory gradient from each session. By Day 8, the 
probability was 1.0. 

Fig. 13. ... inhibition for individual birds in five groups that 
received from 1 to 16 days of discrimination training prior to 
the stimulus-generalization test. The probability of obtaining 
an inhibitory gradient within each group increases with the 
amount of training. The numbers in parentheses for each bird 
indicate, on the left, the total responses to the six-line test 
stimuli, and, on the right, the number of test responses to S+ 
(no line). (From Farthing & Hearst, 1968. © 1968 by the Society 
for the Experimental Analysis of Behavior, Inc.) 

Hearst (1971) found that an inhibitory gradient 
obtained after 64 sessions of prolonged discrimination 
training had essentially the same shape as those ob- 
tained after 8 or 16 hours. Selekman (1973) replicated 
Farthing and Hearst’s experiment, adding a correc- 
tion procedure in which each response to S- extended 
its duration. He found no relationship between the 
number of sessions and the slope of the inhibitory 
gradient which formed during the first session of dis- 
crimination training. Using stimulus compounding as 
an index of inhibition, Yarczower and Curto (1972) 
found that S- suppressed positively reinforced re- 
sponding during the first 10 min of discrimination 
training. 

Thus in both the excitatory and inhibitory case, it 
appears that stimulus control is acquired in minutes 
rather than hours. The failure to find a relation be- 
tween the amount of interdimensional training and 
the amount of inhibition appears related to the use 
of conditions that measure inhibition hours or sessions 
after inhibitory stimulus control has reached asymp- 
tote. 


AVERSIVE BASELINES 


Rilling and Budnik (1975) determined the in- 
fluence of the amount of discrimination training on 
the acquisition of excitatory and inhibitory stimulus 
control in a treadle-press avoidance paradigm. During 
21 days of interdimensional discrimination training, 
generalization was measured daily with a maintained 
procedure by randomly substituting each of six test 
frequencies for the 1,000-Hz training stimulus. Groups 
1 and 2 were designed to measure the acquisition of 
excitatory stimulus control by associating the 1,000-Hz 
tone (S+) with a schedule of free operant avoidance 
and noise (S-) with extinction. Groups 3 and 4 were 
designed to measure the acquisition of inhibitory 
stimulus control by associating the noise (S+) with 
free operant avoidance and the 1,000-Hz tone (S-) 
with extinction. 

Fig. 14. Acquisition of gradients of excitation and inhibition. 
Groups 1 and 2 show the development of the excitatory gradient 
around S+, while Groups 3 and 4 show the development of the 
inhibitory gradient around S-. Each gradient was formed by 
adding the total number of responses to each stimulus across 
three days of training. Each data point is a sum for the three 
birds in each group. The closed circles show the gradients ob- 
tained without a probe shock, while the open circles show the 
gradients obtained with the addition of an unavoidable probe 
shock during each test stimulus. For Groups 1 and 3, the probe 
shock was present during sessions 16-21. For Groups 2 and 4, 
the probe shock was presented from the beginning of discrimi- 
nation training. The arrow indicates the 1,000-Hz stimulus for 
each group. Note the different ordinate scales for Groups 1 and 
2 and for Groups 3 and 4. The frequencies employed were from 
left to right: 300, 450, 670, 1,000, 1,500, 2,500, and 3,400 Hz. 
(From Rilling & Budnik, 1975. © 1975 by the Society for the 
Experimental Analysis of Behavior, Inc.) 


The gradients are presented in Figure 14. For 
Group 1, an increasing number of responses to the 
1,000-Hz tone (designated by an arrow) was observed 
as a function of the number of days of training. This 
increase indicates the birds’ discrimination of the test 
stimuli, which were always presented during extinc- 
tion, from the 1,000-Hz training stimulus, which was 
usually associated with the avoidance schedule. As 
discrimination training progressed, the gradients be- 
came steeper. The addition of the unavoidable shocks 
in Session 16 flattened the generalization gradient by 
elevating responding to the test stimuli more than re- 
sponding to S+. For Group 2, unavoidable probe 
shock elevated responding to the test stimuli in com- 
parison with Group 1. The steepening of the excita- 
tory gradient with training was much less pronounced 
than in Group 1. 

For Group 3, the acquisition of the discrimination 
is reflected in a rapid decrease in the number of 
responses to S- and flat gradients for Sessions 7-15. 
Flat gradients with low responding to the test stimuli 
are difficult to interpret because both neutral and 
inhibitory stimuli produce the same outcome in such 
situations. When the unavoidable probe shocks in- 
creased responding, inhibitory gradients emerged in 
Sessions 16-21. The emergence of the inhibitory gra- 
dients following introduction of the probe shocks re- 
emphasized the equivocal nature of a flat stimulus- 
generalization gradient with a low response output. 
For Group 4, in which the probe shocks were present 
from Day 1, inhibitory gradients were obtained 
throughout discrimination training. 

These results show striking parallels between posi- 
tive and negative reinforcement for the acquisition of 
stimulus control and support Hearst’s views that 
similar laws of generalization apply to positive and 
negative reinforcement. Both the excitatory and in- 
hibitory gradients become steeper during the acquisi- 
tion of the discrimination. 


SUMMARY 


The amount of training has a similar influence on 
the acquisition of stimulus control following inter- 
dimensional and intradimensional discrimination 
training and supports a unified treatment of the two 
types of discrimination training. The acquisition of 
the response has often been confounded with the 
acquisition of stimulus control. When these processes 
are separated by providing nondifferential reinforce- 
ment of the response prior to discrimination training, 
excitatory and inhibitory stimulus control is acquired 
within minutes rather than hours. While more data 
on the minimum amount of training necessary for 
stimulus control would be desirable, the data suggest 
that the probability of obtaining the peak shift and 
inhibitory gradient increases in the course of discrim- 
ination training. Extended training has little in- 
fluence once the necessary and sufficient conditions for 
the inhibitory gradient or peak shift have been 
established. 


Stimulus Sequences and Massed Extinction 


Spence’s theory of discrimination learning does not 
explicitly consider sequential variables, such as fre- 
quency of alternation of S+ and S-. For Spence, 
the relevant variable determining inhibition is the 
amount of exposure to S-. In massed extinction, the 
subject is exposed to a single relatively long presenta- 
tion of S-, which contrasts with the distributed extinc- 
tion obtained in a successive discrimination when S+ 
and S- are briefly presented in alternate or random 
sequences. Spence’s theory of discrimination learning 
predicts a peak shift following massed extinction. 

As it turns out, however, the sequential presenta- 
tion of S+ and S- is essential for the peak shift and 
inhibitory stimulus control, a finding that Spence’s 
theory of discrimination learning fails to predict. 
When S+ and S- were selected from the wavelength 
dimension, Honig, Thomas, and Guttman (1959) 
found that after 20 or 40 min of massed extinction to 
S-, the peak of the generalization gradient occurred 
at S+. A peak shift was obtained following an equiv- 
alent amount of exposure to each stimulus when the 
stimuli were presented alternately. Yarczower and 
Switalski (1969) also obtained no peak shift following 
massed extinction using goldfish as the experimental 
organism. Weisman and Palmer (1969) compared 
massed extinction and an equivalent amount of dis- 
crimination training with interdimensional presenta- 
tions of the discriminative stimuli. The birds receiv- 
ing massed extinction showed a flat gradient along the 
S- dimension, while an inhibitory gradient with the 
minimum at S- was obtained following successive dis- 
crimination training. 

The typical successive discrimination is a mixture 
of four possible transitions: S+S+, S-S-, S+S-, and 
S-S+. The results of the above experiments indicate 
that the identical transitions, S+S+ and S-S-, are 
not effective in producing the peak shift and inhib- 
itory stimulus control. In a well-designed and care- 
fully controlled experiment, Ellis (1970) compared the 
effectiveness of the two opposite transitions: S+S- 
and S-S+. One group experienced a single S+S- 
alternation within each of several daily sessions, while 
the other group experienced a single S-S+ alterna- 
tion. Ellis obtained a peak shift for the birds that 
received a transition between massed extinction and 
massed reinforcement, but the peak shift was absent 
for the group that received massed reinforcement fol- 
lowed by massed extinction. These data suggest that 
the S-S+ transition is the necessary and sufficient condi- 
tion for the peak shift. It would be interesting to 
replicate this experiment with interdimensional train- 
ing to determine if the S-S+ transition is also neces- 
sary and sufficient for inhibitory stimulus control. 

An experiment by Rosen and Terrace (1975) con- 
firmed the absence of the peak shift and inhibitory 
stimulus control following massed extinction in the 
presence of intradimensional and interdimensional 
stimuli. These investigators found that the peak shift 
and inhibitory stimulus control were reinstated by the 
interpolation for 3 min between massed extinction 
and generalization testing of each of the following 
procedures: (1) the presentation of S+ with rein- 
forced responding; (2) the presentation of S+ without 
reinforced responding; and (3) the presentation of 
food independently of responding in the presence of 
the dark key. 

Yarczower (1974) found that, as compared with a 
group that received a generalization test without any 
extinction sessions, massed extinction sharpened con- 
trol by the S+ stimulus. Although Yarczower’s results 
are at variance with some of the earlier experiments, 
the data are in agreement with the finding that, al- 
though massed extinction may sharpen an excitatory 
gradient, it reduces the probability of obtaining in- 
hibitory phenomena such as the peak shift and inhib- 
itory stimulus control. 

These experiments demonstrate that extinction of 
responding in the presence of S- is not inevitably 
followed by a peak shift or inhibitory stimulus con- 
trol. The relevant variable seems to be the number of 
transitions from S- to S+. 
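
The four transition types named above can be tallied directly from the order in which the stimuli were presented. The following is a minimal sketch in Python, not a procedure taken from any of the cited studies; the function name `count_transitions` and the example session are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter


def count_transitions(sequence):
    """Count each ordered pair of adjacent stimulus presentations.

    `sequence` lists the presentations in order, e.g. ["S+", "S-", "S+"].
    """
    return Counter(zip(sequence, sequence[1:]))


# Hypothetical session with a mixture of all four transition types.
session = ["S+", "S+", "S-", "S+", "S-", "S-", "S+"]
counts = count_transitions(session)

# Hypothetical single-alternation sessions of the kind Ellis (1970) compared.
reinforcement_then_extinction = count_transitions(["S+", "S-"])
extinction_then_reinforcement = count_transitions(["S-", "S+"])
```

On this tally, a session consisting of one massed S+ period followed by one massed S- period contains no S-S+ transition at all, whereas the reverse order contains exactly one.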


Stimulus Context and Memory 


Contextual stimuli are stimuli within the experi- 
mental apparatus that remain uncorrelated with the 
contingencies of reinforcement in the experimental 
chamber. Potential contextual stimuli are the lamps 
that illuminate the chamber (houselights), the level of 
the background noise, and the walls and floor of the 
chamber. Since these stimuli are not usually varied 
systematically during the course of an experiment, 
most investigators have assumed that they acquire no 
control over responding. However, contextual stimuli 
play a significant role in models of the inhibitory 
process (e.g., Wagner & Rescorla, 1972) and a theory 
of forgetting as retrieval failure (Spear, 1971). There- 
fore, it was inevitable that investigators would re- 
assess the role of these background stimuli in the stim- 
ulus control of behavior. 

Miller (1972) employed an ABA sequence of dis- 
crimination reversals to assess the role of stimulus 
context on stimulus control. Each of the three major 
experimental groups received identical conditions of 
discrimination training and generalization testing. In 
Phase 1, the discriminative stimuli were 576 nm (S+) 
and 555 nm (S-), while during Phase 2 the discrim- 
inative stimuli were reversed. In the third phase, each 
pigeon received a stimulus-generalization test on the 
wavelength dimension. 

The groups differed with respect to the stimulus 
context. For Group C1C1C1, the stimulus C1 remained 
present throughout the three phases. For Group 
C1C2C2, the stimulus context changed from C1 to C2 
during Phase 2 and remained constant during the 
stimulus-generalization test. For Group C1C2C1, the 
stimulus context changed from C1 to C2 during Phase 
2 and reverted to C1 during generalization test- 
ing. Miller changed the stimuli gradually rather than 
abruptly between phases in order to avoid the loss of 
responding with changes in the contextual stimuli. 
The question of primary interest is whether the peak 
shift was appropriate to the discriminative stimuli of 
Phase 1 or Phase 2. 

The main results of Miller’s experiment are pre- 
sented in Figure 15. For Group C1C1C1, a bimodal 
gradient with maximums at 549 and 559 was obtained. 
The peak shift at 549 is appropriate for the final 
phase of discrimination training, while the peak at 
559 reflects generalization from the S+ employed dur- 
ing the first phase. For Group C1C2C2, the peak shift 
at 549 nm is appropriate to the contextual stimuli 
present during Phase 2 and generalization testing. 
The most dramatic influence of the contextual stimuli 
was obtained for Group C1C2C1, in which the con- 
textual stimuli present during generalization testing 
were associated with Phase 1 of discrimination train- 
ing. This group yielded a peak shift at 587 nm appro- 
priate to Phase 1. For this group, interference from 
Phase 2 was eliminated by manipulating the con- 
textual stimuli. The contextual stimuli enabled the 
subjects to store the memory for the discriminative 
stimuli of Phase 1 independently of the memory for 
the discriminative stimuli of Phase 2. Although the 
contextual stimuli are presumably neutral since they 
are present during both S+ and S-, they exert power- 
ful control over responding. 

Fig. 15. Dependence of the peak shift upon the discriminative 
and background stimuli present during discrimination training 
and the generalization test. When the background stimuli re- 
mained constant, Group C1C1C1, a bimodal gradient with 
maximums at 549 and 559 was obtained. With the presence of 
C2 during reversal training and generalization testing, Group 
C1C2C2, a peak shift occurred at 549 nm away from S2. With 
the presence of C1 during initial discrimination training and 
generalization testing, Group C1C2C1, a peak shift occurred at 
587 nm away from S1 rather than the negative stimulus 
present during reversal training immediately prior to the gen- 
eralization test, S2. (Redrawn from Miller, 1972.) 

Miller’s research demonstrates that paradigms orig- 
inally developed for research on information process- 
ing in humans are also relevant to the stimulus con- 
trol of behavior in infrahuman organisms and that 
the location of the peak of the stimulus-generaliza- 
tion gradient is a sensitive index of interference. 


Schedules of Reinforcement 


Variables associated with the schedule of reinforce- 
ment during discrimination training influence inhib- 
itory stimulus control and the peak shift to a signif- 
icant degree. For example, most experiments on stim- 
ulus control have employed a multiple schedule in 
which S+ was associated with reinforcement and S- 
with extinction. The alternation of these stimuli in a 
successive discrimination produces successive periods 
of responding at relatively high or low rates. 

Whenever multiple schedules of reinforcement are 
utilized, two potential sources of influence on inhib- 
itory phenomena are observed. One relates to the 
interaction between the relative rates of responding 
emitted to each stimulus. The other derives from the 
incentive contrast produced by the difference between 
the density of reinforcement in the two components. 
The following sections will deal with such questions 
as: (1) How much of a difference between the two 
rates of responding is necessary before inhibitory 
phenomena are produced? (2) Is inhibitory stimulus 
control present when a discrimination is established 
by different rates of reinforcement during the two 
stimuli? (3) If this is so, how much of an incentive 
difference between the reinforcers in the two com- 
ponents is necessary? 


FIXED-INTERVAL SCHEDULES 


Every student of operant behavior is familiar with 
the scalloped pattern of responding produced by the 
fixed-interval (FI) schedule. The rate of the rein- 
forced response is zero immediately after reinforce- 
ment and accelerates rapidly through the interval 
until the response is reinforced. This pattern is pro- 
duced by the alternation of a period of extinction 
with a period of reinforcement availability. Staddon 
(1969) suggested that a stimulus-generalization test 
conducted by varying the stimulus present throughout 
the interval should produce inhibitory stimulus con- 
trol at the beginning of the interval and excitatory 
stimulus control at the end of the interval. This is 
exactly the result obtained in an elegant experiment 
by Wilkie (1974). 

Responses in the presence of a vertical line (0°) 
were reinforced on FI 3-min for two birds and FI 6- 
min for the third bird. In order to increase stimulus 
control by the line, each reinforcement was followed 
by a blackout of 10 min. Generalization was measured 
by randomly varying the angle of the line during 
different intervals. Each stimulus was present through- 
out the interval. An essential component of Wilkie’s 
experiment was a maintained procedure in which re- 
sponses were reinforced as usual throughout the gen- 
eralization test. 

The generalization gradients in the left-hand panel 
of Figure 16 show that an inhibitory gradient with a 
minimum at the vertical line was obtained from each 
bird during the first third of the interval, while an 
excitatory gradient was obtained during the last third. 
During the middle of the interval, the rate of re- 
sponding was not influenced by the angle of the line. 
These results are dependent upon maintaining re- 
sponding during the generalization test with reinforce- 
ment at the end of the interval. The right-hand panel 
of Figure 16 demonstrates that a family of excitatory 
gradients was obtained when the test was carried out 
during extinction with the omission of reinforcement. 
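
The analysis by thirds of the interval can be illustrated with a short sketch. It is offered only as an illustration under stated assumptions: the function name `rate_by_thirds`, the use of response times in seconds, and the example data are all hypothetical and are not taken from Wilkie (1974).

```python
def rate_by_thirds(response_times, interval_length):
    """Return responses per minute in each successive third of one interval.

    response_times: times of responses, in seconds from the start of the interval.
    interval_length: duration of the fixed interval in seconds.
    """
    third = interval_length / 3
    counts = [0, 0, 0]
    for t in response_times:
        if 0 <= t < interval_length:
            # Clamp to the last third to guard against rounding at the boundary.
            counts[min(int(t // third), 2)] += 1
    minutes_per_third = third / 60
    return [c / minutes_per_third for c in counts]


# Hypothetical FI 3-min interval (180 s) with invented response times
# that follow the familiar scalloped pattern.
times = [70, 95, 110, 125, 140, 150, 160, 165, 170, 175]
rates = rate_by_thirds(times, 180)
```

Computing these three rates separately for each line angle would yield one gradient for each third of the interval, which is the form in which the results in Figure 16 are described.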

Wilkie’s results support the theory of temporal 
control developed by Staddon (1974). When food is 
delivered on a schedule with a constant period, Stad- 
don predicts that food acquires inhibitory after- 
effects that decay with the passage of time. There- 
fore, a stimulus associated with the termination of 
food acquires inhibitory stimulus control. 


MULTIPLE SCHEDULES: RESPONSE RATE 
AND INCENTIVE INTERACTIONS 


Under the typical conditions of discrimination 
training, a mult VI extinction schedule produces a 
peak shift following intradimensional training and an 
inhibitory gradient following interdimensional train- 
ing. The data reviewed in this section demonstrate 
that the peak shift and inhibitory stimulus control 
are also obtained under a wide range of conditions in 
which responding is reinforced during each of the 
two components. Incentive variables are the proper- 
ties of the reinforcer that determine its effectiveness 
in maintaining responding. The delay, frequency of 
occurrence, magnitude, or intensity of a positive or 
negative reinforcer are incentive variables that com- 
bine with the rate of responding to determine the 
probability of the peak shift and the amount of in- 
hibitory stimulus control. The relative incentive 
value of two schedules of reinforcement is operation- 
ally defined by the organism’s degree of preference for 
the stimuli associated with each schedule when they 
are presented simultaneously. Measuring this prefer- 
ence requires an independent condition which un- 
fortunately is rarely employed in experiments on 
stimulus control. 

Fig. 16. Response rate in the presence of the lines at different 
angles during successive thirds of the fixed intervals that 
followed 0° intervals ending in reinforcement (left-hand panels) 
or nonreinforcement (right-hand panels). When responding is 
reinforced, an inhibitory gradient is obtained at the beginning 
of the interval and an excitatory gradient is obtained at the 
end of the interval. Rate is averaged for each of the three birds 
over five test sessions. (From Wilkie, 1974. © 1974 by the So- 
ciety for the Experimental Analysis of Behavior, Inc.) 

The analysis was suggested by some work of Weiss 
(1971, 1974) and Weiss and Van Ost (1974) in the area 
of stimulus compounding. Weiss (1974) formulated a 
model of stimulus control in which response and in- 
centive properties conditioned to tone and light com- 
bine to produce the behavior resulting in T + L. 
Maximum additive summation occurs when T + L is 
composed of stimulus elements associated with an in- 
crease in both response and reinforcement rates. For 
maximum suppressive summation, T + L is composed 
of stimulus elements associated with a decrease in both. 
When only one factor is operating, response or rein- 
forcement, while the other is constant over multiple- 
schedule components, T + L should control only 
moderate summation. When response and reinforce- 
ment properties are conflicting, one increasing and 
the other decreasing, minimal or no summation 
should occur to T + L. Since summation and peak 
shift are sensitive to many of the same variables, it 
could prove useful to apply Weiss’s formulations to 
predictions of peak shift and inhibitory stimulus con- 
trol. 

Some new terminology must be introduced before 
translating from the stimulus-compounding to the 
stimulus-generalization paradigm. For the many cases 
in which a successive discrimination is established by 
reinforcing responding at different rates during each 
of the two components of a multiple schedule, the 
notation of S+ and S- is inadequate. In the conven- 
tion followed here, the stimulus associated with con- 
stant conditions of reinforcement during prediscrimi- 
nation and discrimination training is S1, and the 
stimulus associated with a change in the conditions of 
reinforcement is S2, whether the change is an increase 
or decrease in reinforcement density. Thus the rate 
of responding during S2 may either increase or de- 
crease during the acquisition of the discrimination 
depending upon the nature of the change in rein- 
forcement. In the special multiple schedule in which 
extinction is introduced after a period of nondiffer- 
ential reinforcement, S+ corresponds to S1 and S- 
to S2. 

Translating from the stimulus-compounding to the 
stimulus-generalization paradigm, the theory predicts 
that the probability of the peak shift and an inhibi- 
tory gradient is maximized when S1 controls a higher 
response rate and reinforcement frequency than S2. 
When S1 is associated with a change in only one of 
these variables, the probability of the peak shift and 
an inhibitory gradient is less. Finally, when S1 is dis- 
criminative for an increase in the response rate, but 
signals a decrease in the frequency of reinforcement, 
the probability of the peak shift and inhibitory gradi- 
ent is minimized or eliminated. 

The frequency of reinforcement during S1 and S2 
has been specified and carefully controlled in many 
experiments. Therefore, the frequency of reinforce- 
ment rather than a direct measure of the incentive 
value or response strength is employed in the analy- 
sis of these experiments. The assumption made in this 
analysis is that the frequency of reinforcement is di- 
rectly related to incentive value and may therefore 
serve as an estimate of the incentive value, although 
the correlation between the two remains to be veri- 
fied. It is reasonable to assume that other variations 
in reinforcement affecting preference—e.g., delay, mag- 
nitude, and intensity—would act similarly in deter- 
mining incentive differences. 

Table 1, which is an adaptation of one Weiss (1974) 
presented in his analysis of response and incentive 
variables in the results of stimulus compounding, 
presents the results of stimulus-generalization tests 
following discrimination training in a number of 
studies. These experiments are classified with respect 
to the reinforcement and response rates in S2 as com- 
pared to S1. S1 is the stimulus associated with con- 
stant conditions of reinforcement during prediscrimi- 
nation and discrimination training. S2 is the stimulus 
associated with a change in the conditions of rein- 
forcement and/or response requirements, whether 
these changes are increases or decreases. 

The response rate and reinforcement frequency 
in S2 may each decrease, increase, or remain unchanged 
during the acquisition of the discrimination. This 
yields a 3 × 3 matrix with nine cells. Each cell is sub- 
divided according to whether intra- or interdimen- 
sional training was employed. For each experiment, 
the schedule associated with S1 is given first, followed 
by S2 in parentheses. The outcomes predicted by ex- 
trapolation of Weiss’s theory are indicated within 
each cell as the probability of the peak shift follow- 
ing intradimensional training. The probability of an 
inhibitory gradient following interdimensional train- 
ing is also presented. Each cell contains investiga- 
tions meeting the response and reinforcement rela- 
tions indicated for $1 and S2. In general, the fit is 
excellent, and, with the exceptions noted in the foot- 
notes, the experiments support the predictions from 
Weiss’s model. 
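
The pattern of predictions across the nine cells can be stated as a simple rule, sketched below in Python. This is an illustrative restatement of the cell labels in Table 1 and not Weiss's own formalization; the function name `predicted_strength` and the coding of changes as -1, 0, and +1 are assumptions introduced here.

```python
def predicted_strength(response_change, reinforcement_change):
    """Predicted strength of the peak shift or inhibitory gradient.

    Each argument codes the change in S2 relative to S1:
    -1 for a decrease, 0 for no change, +1 for an increase.
    """
    if response_change * reinforcement_change > 0:
        # Both factors change in the same direction.
        return "maximum"
    if (response_change == 0) != (reinforcement_change == 0):
        # Exactly one factor changes while the other is constant.
        return "moderate"
    # The factors conflict, or neither changes.
    return "minimal"


# Reproduce the 3 x 3 matrix: rows are reinforcement change, columns response change.
matrix = [
    [predicted_strength(resp, reinf) for resp in (-1, 0, 1)]
    for reinf in (-1, 0, 1)
]
```

The upper left and lower right cells come out as "maximum", the four cells in which only one factor changes come out as "moderate", and the remaining three cells come out as "minimal".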

The upper left cell, in which both the response 
rate and frequency of reinforcement during S2 de- 
creased, was the starting point of this research effort. 
Hanson's (1959) original experiment, where respond- 
ing in S2 was reduced through extinction while a VI 
schedule remained operative in S1, is the extreme ar- 
rangement for this cell. Guttman (1959) extended 
Hanson’s original experiment on the peak shift with 
intradimensional discrimination training on a mult 
VI 1-min VI 5-min schedule instead of a mult VI 
1-min extinction schedule. A peak shift away from 
the stimulus associated with the VI 5-min schedule 
was obtained during a stimulus-generalization test. 
The shape of the postdiscrimination gradient in Gutt- 
man’s experiment was virtually identical to that ob- 
tained in Hanson’s experiment, so the VI 5-min 
schedule was as effective as extinction in producing 
the peak shift. This result was confirmed by Terrace 
(1968) and by Wheatley and Thomas (1974). Since the 
peak shift was obtained when each discriminative 
stimulus was associated with a schedule of reinforce- 
ment, extinction during S2 is not necessary for the 
production of the peak shift. 

Using intradimensional training, Dysart, Marx, Mc- 
Lean, and Nelson (1974) systematically varied the 
frequency of reinforcement during S2 from VI 1-min 
through VI 5-min to extinction using a separate 
group of pigeons for each condition. Within each 
group, the probability of obtaining a peak shift away 
from S2 increased as a function of the decrease in the 
relative frequency of reinforcement associated with 
S2. Since the correlation between the rate of re- 
sponding and the rate of reinforcement during S2 was 
+.97, it was not possible to specify which of these fac- 
tors was responsible for the peak shift. 

Among the most impressive evidence for a rela- 
tive interpretation of the peak shift was a study by 
Wheatley and Thomas (1974) in which four of six 
pigeons showed a peak shift away from the VI 24-sec 
component of a mult VI 12-sec VI 24-sec schedule. 
The best predictor of the peak shift in their experi- 
ment was a discrimination index, responses to S1 
divided by total responses. A poor discrimination 
index in the study by Yarczower, Dickson, and Gollub 
(1966) may explain why this study was the only fail- 
ure to obtain a peak shift for the experiments in the 
upper left cell of Table 1. Wheatley and Thomas con- 
clude that, given a good index of discrimination be- 
tween S1 and S2, a peak shift is obtained away from 
S2 when S2 is associated with a high frequency of 
reinforcement, provided S1 is associated with a still 
higher frequency. 
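
The discrimination index described above, responses to S1 divided by total responses, is simple to compute. The sketch below is only an illustration; the function name `discrimination_index` and the response counts are hypothetical and are not data from Wheatley and Thomas (1974).

```python
def discrimination_index(responses_s1, responses_s2):
    """Return responses to S1 divided by total responses to S1 and S2."""
    total = responses_s1 + responses_s2
    if total == 0:
        raise ValueError("no responses recorded in either component")
    return responses_s1 / total


# Hypothetical response counts for two birds.
good_discrimination = discrimination_index(900, 100)
no_discrimination = discrimination_index(500, 500)
```

An index of 0.5 means that responding did not differ between the components, and values approaching 1.0 mean that nearly all responding occurred in S1.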

The Wheatley and Thomas study was carried out 
with a two-component multiple schedule. Suppose, 
following Weiss, that a three-component schedule is 
employed—e.g., mult VI 12-sec VI 24-sec EXT, with 
extinction associated with the absence of the discrim- 
inative stimuli. By this temporal manipulation of the 
stimulus context it seems possible that the peak shift 
would not be obtained. 

TABLE 1. Results of Stimulus-generalization Experiments Classified According to Response Rate and Reinforcement Rate Differences. The Schedule Associated with S1 Is Given First, Followed by S2 in Parentheses. 

Columns: RESPONSE RATE IN S2 COMPARED WITH S1 (Decrease, No Change, Increase). 
Rows: reinforcement frequency in S2 compared with S1 (decrease, no change, increase). 

RESPONSE RATE: DECREASE 

  Reinforcement frequency: decrease 
    Intradimensional training. Prediction: MAXIMUM PEAK SHIFT 
      VI 1' (VI 5'): Guttman (1959); Terrace (1968); Wheatley & Thomas (1974) 
      VI 30" (VI 1'), VI 1' (VI 2') or (VI 5'): Dysart et al. (1974) 
      VI 30"-DRL 4" (VI 4-DRL 8"): Yarczower et al. (1966) a 
      VI 12" (VI 24") or (VI 60"): Wheatley & Thomas (1974) 
    Interdimensional training. Prediction: MAXIMUM INHIBITORY GRADIENT 
      VI 1' (VI 5'): Weisman (1969) 

  Reinforcement frequency: no change 
    Intradimensional training. Prediction: MODERATE PEAK SHIFT 
      VI 1' (DRL 6"), VI 1' (VI 1' + punishment): Terrace (1968) 
      VI 1' (DRO 50"): Yarczower et al. (1968) 
      VI 1'-6" Reinf. (VI 1'-2" Reinf.): Mariner & Thomas (1969) 
      VI 1' (VI 1' + delay): Wilkie (1972) 
      VI 1' (VT 1'): Huff et al. (1973) 
    Interdimensional training. Prediction: MODERATE INHIBITORY GRADIENT 
      VI 1' (DRL) or (DRO): Weisman (1969, 1970) 
      VI 1' (VT 1'): Weisman & Ramsden (1973) 
      VI 1' (VI 1' + delay): Richards (1973) 

  Reinforcement frequency: increase 
    Intradimensional training. Prediction: MINIMAL PEAK SHIFT 
      VI 30"-DRL 4" (DRO 10"): Yarczower et al. (1968) 
    Interdimensional training. Prediction: MINIMAL INHIBITORY GRADIENT 
      No data 

RESPONSE RATE: NO CHANGE 

  Reinforcement frequency: decrease 
    Intradimensional training. Prediction: MODERATE PEAK SHIFT 
      VI 2.5' (VI 5'): Wheatley & Thomas (1974) 
      VI 30"-DRL 4" (VI 3" DRL 2"): Yarczower et al. (1966) a 
    Interdimensional training. Prediction: MODERATE INHIBITORY GRADIENT 
      No data 

  Reinforcement frequency: no change 
    Intradimensional training. Prediction: MINIMAL PEAK SHIFT 
      VI 1' (DRL 6"): Terrace (1968) 
    Interdimensional training. Prediction: MINIMAL INHIBITORY GRADIENT 
      VI 1' (VT 1'): Weisman & Ramsden (1973) 

  Reinforcement frequency: increase 
    Intradimensional training. Prediction: MODERATE PEAK SHIFT 
      No data 
    Interdimensional training. Prediction: MODERATE INHIBITORY GRADIENT 
      No data 

RESPONSE RATE: INCREASE 

  Reinforcement frequency: decrease 
    Intradimensional training. Prediction: MINIMAL PEAK SHIFT 
      No data 
    Interdimensional training. Prediction: MINIMAL INHIBITORY GRADIENT 
      No data 

  Reinforcement frequency: no change 
    Intradimensional training. Prediction: MODERATE PEAK SHIFT 
      No data 
    Interdimensional training. Prediction: MODERATE INHIBITORY GRADIENT 
      No data 

  Reinforcement frequency: increase 
    Intradimensional training. Prediction: MAXIMAL PEAK SHIFT 
      VI 5' (VI 1'): Terrace (1968) 
      VI 1'-2" Reinf. (VI 1'-6" Reinf.): Mariner & Thomas (1969) 
    Interdimensional training. Prediction: MAXIMAL INHIBITORY GRADIENT 
      No data 

a A peak shift was not obtained. 
b Only one of the three birds showed a peak shift. 
SOURCE: Predictions adapted from Weiss’s (1974) two-factor combinational model of stimulus control. 

Weisman (1969) ran a parallel interdimensional 
experiment on a multiple VI 1-min and VI 5-min 
schedule. A shallow inhibitory gradient on the dimen- 
sion of line orientation was obtained for each of the 
animals where the rate of responding to S2 was re- 
duced by a shift from VI 1-min to VI 5-min. If the 
determinants of the peak shift and the inhibitory 
gradient are identical, then, extending Wheatley and 
Thomas (1974), some pigeons should demonstrate an 
inhibitory gradient around a stimulus associated with 
a VI 24-sec schedule following interdimensional train- 
ing on a mult VI 12-sec VI 24-sec schedule. Additional 
data are needed to determine the limits of inhibitory 
stimulus control following interdimensional training 
with various schedules associated with S1 and S2. 

Spence viewed inhibition in absolute rather than 
relative terms and always identified inhibition with 
extinction. One of the major conclusions from the 
experiments in the upper left cell in Table 1 is that 
the peak shift and inhibitory stimulus control are 
determined by the relative rates of reinforcement 
during S1 and S2. Richards’s unpublished experiment 
demonstrated that S2, a stimulus associated with a VI 
1-min schedule in which reinforcement was delayed 
for 10 sec, was inhibitory and produced a decrement 
in responding when presented simultaneously with S1 
in a combined-cues test. The amount of inhibition 
was less than was obtained when S2 was associated 
with extinction. These data reduce the appeal of 
Spence’s original formulation and require a relative 
theory which assumes that S2 becomes an inhibitory 
stimulus when it is associated with a less favorable 
schedule of reinforcement. 

In the experiments in the upper left cell of Table 
1, the reduction in the frequency of reinforcement 
during S2 was confounded with a reduction in the 
rate of responding during S2. Therefore, the next 
logical step was to hold the frequency of reinforce- 
ment constant during S1 and S2 while reducing the 
rate of responding in the presence of S2. The experi- 
ments in which this procedure was employed are in- 
dicated in the middle cell of the “Decrease” column 
in Table 1. The evidence from these experiments is 
quite consistent: both the peak shift and inhibitory 
gradient are obtained. Therefore, reduction in rein- 
forcement frequency during S2 is not necessary for 
these phenomena. Reduction of the response rate 
alone might be adequate. However, an equal fre- 
quency of reinforcement during S1 and S2 does not 
imply that the incentive values of S1 and S2 were 
identical. Although the relative incentive values of 
S1 and S2 were not measured in any of these experi- 
ments, it is likely that S1 is preferred to S2 when each 
response to S2 is punished (Terrace, 1968), when the 
magnitude of reinforcement during S2 is smaller 
than S1 (Mariner & Thomas, 1969), when reinforce- 
ment during S2 is delayed (Richards, 1973; Wilkie, 
1972), or when reinforcement during S2 is indepen- 
dent of responding (Huff, Sherman, & Cohn, 1975). 
Additional data are needed to determine if VI sched- 
ules are preferred to the DRL, DRO, and VT 
schedules employed by Terrace (1968), Yarczower, 
Gollub, and Dickson (1968), Weisman (1969, 1970), 
and Weisman and Ramsden (1973). No data are avail- 
able to determine if the peak shift and inhibitory 
stimulus control are obtained when the relative in- 
centive values of S1 and S2 are identical. 

A study by Yarczower et al. (1968), in the lower left 
cell of Table 1, suggests that the peak shift does not 
occur when the frequency of reinforcement during S2 
is increased, but the rate of responding is decreased. 
The paucity of data in this cell indicates the lack of 
research under these conditions. 

The “No Change” column of Table 1 presents data 
for experiments in which the rate of responding dur- 
ing S1 was identical to the rate during S2 so that 
differential responding between these stimuli was not 
present prior to the stimulus-generalization test. The 
data are remarkably consistent. Independently of the 
reinforcement rate during S2, the peak shift and in- 
hibitory stimulus control are not obtained. 

Weiss predicts that incentive differences alone 
should produce a moderate probability of the peak 
shift and inhibitory stimulus control when the rates 
of responding to S1 and S2 are identical. The Wheat- 
ley and Thomas (1974) and Yarczower et al. (1966) 
experiments fail to support this prediction, but we do 
not know if these subjects were discriminating the 
reinforcement differences between components, a nec- 
essary precondition for establishing different incen- 
tive values. 

The “Increase” column of Table 1 indicates that 
in only two experiments has the discrimination been 
acquired by increasing the response rate during S2 
while a lower rate is maintained during S1. The 
lower right cell indicates that the peak shift has been 
obtained by increasing both the frequency of rein- 
forcement and response rate during S2 in the experi- 
ments of Terrace (1968) and Mariner and Thomas 
(1969). However, the data from these experiments are 
not as consistent as those in the upper left cell of 
Table 1. Perhaps this is because it is easier to estab- 
lish a discrimination between S1 and S2 by decreasing 
rather than increasing the rate of responding during 
S2. 
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In general, these data support the conceptual 
framework developed by Weiss (1971, 1972, 1974) 
through his work in stimulus compounding, suggest- 
ing that incentive and rate differences between SI and 
S2 combine to produce the peak shift and inhibitory 
stimulus control. Thus when the incentive parameters 
of SI and S2 are held constant, the greater the index 
of discrimination between S1 and S2, the higher the 
probability of obtaining the peak shift or inhibitory 
stimulus control. When S1 and S2 control comparable 
response rates, this formulation predicts that the 
greater the incentive difference between S1 and S2, 
the higher the probability of obtaining the peak shift 
or inhibitory stimulus control. However, since it ap- 
pears that these two variables might combine, the 
greatest probability of peak shift and the largest in- 
hibitory effects are predicted for the upper left and 
lower right cells on Table 1 where large incentive and 
rate differences between Sl] and S2 are established 
during discrimination training. Although this analy- 
sis was limited to positive reinforcement, similar re- 
sults are predicted for responding maintained by 
negative reinforcement. The extent to which these 
variables determine the peak shift and inhibitory 
stimulus control awaits the results of further para-
metric research, but a glance at Table 1 indicates a
remarkable consistency in the data from many differ-
ent laboratories.


CONCURRENT SCHEDULES


In a concurrent schedule, two discriminative stim- 
uli are presented at separate locations and responding 
to each stimulus is reinforced independently. Con-
current schedules are ideally suited for experiments
designed to measure the influence of incompatible 
responses on the peak shift and inhibitory stimulus 
control. Although Spence’s theory of discrimination 
learning was designed for the simultaneous paradigm, 
one of the ironies of research based on his theory is 
that some of the data that best fit Spence’s theory 
were obtained from a successive paradigm. 

Catania (1969) has developed a theory of inhibition 
which removes much of the mystery from this phe- 
nomenon by specifying the experimental procedures 
which produce the inhibited and inhibiting events. 
Consider the case in which a stimulus is presented on 
each of the left and right keys and responding on each 
key is reinforced according to a concurrent schedule. 
According to Catania, reinforcement of a response on 
the left key inhibits responding on the right key by 
increasing the probability of a further response on the
left key. Similarly, reinforcing a response on the right
key inhibits responding on the left key. As Catania
puts it, “the rate of a reinforced response is inhibited
by the reinforcement of other responses” (p. 741). The
advantage of Catania’s formulation is that it treats 
reinforcement (e.g., of a response on the left key) as the 
causal variable which produces inhibition (of re- 
sponding on the right key). However, the “inhibition” 
of one response by the reinforcement of another does 
not necessarily imply that the stimulus controlling 
that response, or the lack of it, is an inhibitory stim- 
ulus. This requires an independent test for an in- 
hibitory stimulus. 

D. Blough (1973) obtained a peak shift on both the 
right and left keys following successive intradimen-
sional training by reinforcing a response on the oppo-
site key during S-. On the right key, stimuli of 550
and 559 nm were presented alternately as in the usual
multiple schedule. The left key was always illumi-
nated with a white diamond. Responses on the right
key were reinforced in the presence of 550 nm and 
extinguished in the presence of 559 nm. On the left 
key the contingencies were reversed. Responses on the 
left key were reinforced in the presence of 559 nm 
(when it appeared on the right key) and extinguished 
in the presence of 550 nm. After discrimination train-
ing, two stimulus-generalization gradients were ob-
tained by recording the number of responses to the
left and right keys while the wavelength of the stim-
ulus on the right key was varied.

The left panel of Figure 17 shows the gradients 
obtained from responses on the left key, and the 
right panel shows the gradients from the right key. 
The left panel of Figure 17 shows a peak shift for
each of the birds with the maximum displaced from
S+ (559 nm) toward a longer wavelength. The right
panel shows a peak shift for each of the birds with
the maximum displaced from S+ (550 nm) toward the
short wavelengths.

Fig. 17. (Left) Left-key generalization gradients from individual
birds trained with VI reinforcement for pecks at the left key
during 559 nm (S+) and extinguished during 550 nm (S-). On the left
key, a positive peak shift with more responding to the longer
wavelengths occurred. (Right) Right-key generalization gradients
from individual birds reinforced for pecks on the right key
during 550 nm (S+) and extinguished during 559 nm (S-). On
the right key, a positive peak shift with more responding to the
shorter wavelengths occurred. (From D. Blough, 1973.)
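The peak-shift criterion used in describing these gradients (a maximum displaced from S+ in the direction away from S-) can be written out as a short check. The following is only a sketch: the function name and the response counts are hypothetical and are not taken from Blough's data.

```python
def peak_shift(gradient, s_plus, s_minus):
    """Summarize a generalization gradient.

    gradient: dict mapping test wavelength (nm) -> total responses.
    Returns (peak wavelength, displacement from S+, shifted away from S-?).
    """
    # Wavelength that controlled the most responding.
    peak = max(gradient, key=gradient.get)
    displacement = peak - s_plus
    # A peak shift requires a nonzero displacement on the side of S+
    # opposite to S-.
    away_from_s_minus = displacement != 0 and (displacement > 0) == (s_plus > s_minus)
    return peak, displacement, away_from_s_minus


# Hypothetical right-key gradient (S+ = 550 nm, S- = 559 nm); counts are invented.
right_key = {538: 40, 544: 95, 547: 120, 550: 90, 553: 35, 556: 10, 559: 2}
print(peak_shift(right_key, s_plus=550, s_minus=559))
```

With these invented counts the maximum falls at a wavelength shorter than S+, which is the direction reported for the right key.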

In a similar experiment with interdimensional 
training, Catania, Silverman, and Stubbs (1974) ob- 
tained an inhibitory gradient on the left key by rein- 
forcing responses on the right key when S~ was pres- 
ent on the left key. Inhibitory processes have also 
been successfully investigated by Beale and Winton 
(1970), Winton and Beale (1971), and Honig et al. 
(1972), who measured generalization with a concur-
rent procedure in which an “advance” or changeover
response was employed.

The experiments of D. Blough (1973) and Catania, 
Silverman, and Stubbs (1974) provide excellent sup- 
port for Catania’s theory of inhibition of one response 
by reinforcement of another response. However, it is
important to emphasize that in Blough’s experiment
the peak shift was obtained when S+ alternated with
S- in a successive discrimination on the same key. It
is interesting to note that the peak shift was not ob-
tained in experiments (e.g., Honig, 1962, 1967) in
which S+ and S- were presented on different keys.
When S+ was located on one key and S- on a differ-
ent key, the probability of the peak shift might have
decreased as a function of the distance between the
keys, since this made the comparison of S+ with S-
more difficult. The advantage of the concurrent pro-
cedures, which reinforce a response on the left key
when extinction is in effect on the right key, is that 
the competing response is brought under experimen- 
tal control. 


ERRORLESS LEARNING RECONSIDERED 


The previous sections described the influence of 
discrimination training on the shape of the generali-
zation gradient and demonstrated the variables that
have been established as the determinants of the peak 
shift and inhibitory stimulus control. This section 
considers the influence of a special type of discrimina- 
tion training, “errorless” discrimination training, on 
stimulus control and inhibitory processes. Errorless 
learning refers to the class of discriminations in which 
the rate of responding to S~ approaches zero from the 
first session of discrimination training. The major 
reason for the extensive interest in errorless learning 
is that it is considered an exception to the basic laws 
of discrimination learning developed in the previous 
sections.


What Is Errorless Learning?


THE DEFINITION OF 
ERRORLESS LEARNING 


The “error” of errorless learning is the occurrence
during S- of the response that is reinforced during
S+. The definition of an error depends upon the cri- 
teria employed for defining response. Consider the 
pecking response of the pigeon, which is the typical 
response in most experiments in errorless learning. 
This response consists of an orientation toward the 
key, an approach to the key, and a peck on the key. 
A response is usually defined automatically by a force 
of about 20 g on the key, and other components of 
the response are typically not recorded. The number 
of errors may be increased by relaxing the criterion for 
a response to include approach responses which do 
not contact the key, and the number of errors may be 
decreased by increasing the force of the peck required 
to activate the key or by restraining the animal from 
contacting the key. For example, a pigeon which 
made zero pecks on the key could be converted into a 
subject which learned with errors by counting near- 
misses, pecks around the key, etc. 

Ethological analyses using procedures developed 
by Staddon and Simmelhag (1971) reveal a number of 
potentially interesting responses and response se- 
quences which are modified during errorless discrim- 
ination training. These analyses indicate that error- 
less discrimination training procedures are effective 
in eliminating the terminal peck response from the 
sequence of observing and approach responses to S~. 
For example, Wessells (1974, Experiment III) ob-
tained an errorless discrimination without fading us-
ing an autoshaping procedure. On half of the trials,
a white light, (CS+), on the left key was immediately 
followed by food. On the other trials, a green key 
light, (CS—), was not followed by food. Wessells re- 
corded three key-directed behaviors during each trial: 
(1) an orienting response described as “looking at the 
key,” (2) an approach response defined as any move- 
ment toward the key, and (3) the peck at the key. 
Wessells found that the emergence of the pecking re- 
sponse was always preceded by an orientation-approach 
sequence. During S~ isolated approach responses in- 
creased rapidly during the first few trials of dis- 
crimination training and then decreased, while during 
S+ the approach toward the key consistently increased. 
The orientation response behaves exactly as the tra-
ditional theory of discrimination predicts. During S+,
the orientation response is maintained by reinforce-
ment, while the response extinguishes during S-.

In order to obtain more data on the behaviors
which are observed during the acquisition of an error- 
less discrimination, Rilling, Caplan, and Brown’s un- 
published study recorded the three responses specified 
by Wessells during errorless autoshaping and subse- 
quent sessions of discrimination training in which the 
duration of each component was gradually increased 
to 1 min. After autoshaping, responses during S+ were
reinforced on a VI 30-sec schedule. For 10 of the 12 
birds in the experiment, the following sequence of 
responses emerged during successive S+ trials: an ob- 
serving response appeared earliest, followed by an 
approach to the key, which was finally followed by 
pecks on the key. A hierarchy of responses was ob-
served during S-. For each of the birds, the probabil-
ity of an observing response was greater than the sum
of the approach and key-peck responses during S-.
The difference in behavior between S+ and S- is that
reliable pecking does not emerge from the approach
response during S-. Although many of the birds were
errorless with respect to pecks on the key, they were
not “errorless” with respect to the response of ap-
proaching the key. The responses for one of the error-
less pigeons are presented in Figure 18.

Since the definition of an error response during S— 
is arbitrary, it follows that the definition of errorless 
learning is also arbitrary. Researchers of errorless 
learning tend to differentiate cases of learning with- 
out errors from cases of learning with errors based 
upon the absolute number of errors produced through-
out the experiment. Terrace (1972) described subjects
exhibiting few errors (on the order of 25 or fewer for
the parameters of his experiment) as performing
fundamentally the same as subjects which made no
errors at all. Both groups were labeled “errorless,” as
they appeared distinct from subjects which made sub-
stantially more nonreinforced responses during S-.
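The dependence of the “errorless” label on both the response criterion and the error cutoff can be made concrete with a small sketch. The cutoff of 25 follows the order of magnitude Terrace (1972) mentions for his parameters; the function name and the response counts below are hypothetical.

```python
def classify(error_count, cutoff=25):
    """Label a subject by comparing its error count with an arbitrary cutoff."""
    return "errorless" if error_count <= cutoff else "with errors"


# One hypothetical bird scored under two response definitions (invented counts).
counts = {
    "key pecks only": 0,
    "key pecks plus approaches": 60,
}
for definition, n_errors in counts.items():
    print(f"{definition}: {n_errors} errors -> {classify(n_errors)}")
```

The same bird changes category when approach responses are counted, or when the cutoff is moved, without any change in its behavior.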

This classification is theoretically useful if the per- 
formance of those subjects which make zero or a few 
errors, however defined, during S— is fundamentally 
different from those subjects which make many errors. 
The data reviewed in this section demonstrate that 
this classification is not useful, because few phenomena 
of discrimination learning depend reliably upon the 
rate of responding during S—. Therefore, the distinc- 
tion between learning without errors and learning 
with errors is arbitrary. A more fruitful strategy may 
be to isolate those variables that determine the rate 
of responding during S~ and to develop theories that 
predict these rates. This strategy avoids the embar-
rassing question of how many errors are necessary be-
fore errorless learning becomes errorful learning.

Fig. 18. Number of responses pecking the key, approaching the
key, and orienting toward the key during S+ and S- for bird
5189 during the acquisition of an “errorless discrimination.”
During S+ orienting and approaching the key preceded the
emergence of pecking, while during S- orienting and approach-
ing the key were not followed by pecking. (From Rilling, Cap-
lan, & Brown, unpublished data.)


ERRORLESS LEARNING AS THE 
TRANSFER OF STIMULUS CONTROL


One of the most potent determinants of the num- 
ber of errors during the acquisition of a discrimina- 
tion is the physical similarity between S+ and S~. For 
example, when S+ was a green and S~ a dark key, 
Kodera and Rilling (1975) obtained fewer errors than 
when S+ was a green and S~ a red key. As Terrace 
(1973) points out, the failure to respond to the dark 
key is presumably a transfer from the discrimination 
training that took place in the animal’s environment 
prior to the experiment. Pigeons are reinforced for 
pecking at brightly colored bits of grain, but are not 
reinforced for pecking at dark holes. These experi-
ments suggest that interdimensional training often
produces fewer errors than intradimensional training
because of the greater similarity of the stimuli in the 
intradimensional case. 

Therefore, special techniques are necessary to es- 
tablish an errorless discrimination when S+ and S- 
are similar as in intradimensional training. In fading, 
a property of a stimulus is gradually changed on suc- 
cessive trials to transfer control of responding from 
one property of a stimulus to another. For example, 
in the experiment which initiated the current interest 
in errorless learning, Terrace (1963a) first established 
an errorless discrimination between a red S+ and a 
dark S- by beginning with a dark key with a dura-
tion of 1 sec. In Phase 1, the duration of the dark key
was lengthened from 1 to 30 sec. In Phase 2, the dis-
crimination was transferred from a dark key, S-, to
a green S- by reducing the duration of the dark key
to 1 sec and gradually increasing the intensity of the
chromatic S- to match that of S+. In Phase 3, the
duration of the green S- was gradually increased
from 1 to 30 sec to match the duration of S+.
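The three phases of this fading procedure can be laid out as a schedule of S- presentations. The sketch below is illustrative only: the number of steps per phase, the linear spacing, and the 0-to-1 intensity scale are assumptions, since the step sizes Terrace actually used are not given here.

```python
def linspace(start, stop, steps):
    """Evenly spaced values from start to stop, inclusive (steps >= 2)."""
    return [start + (stop - start) * i / (steps - 1) for i in range(steps)]


def fading_schedule(steps_per_phase=5, max_duration=30.0):
    """Return a list of (phase, S- key, duration in sec, relative intensity).

    Intensity 0.0 stands for the dark key and 1.0 for a green S- matched
    to S+; both the scale and the linear steps are hypothetical.
    """
    schedule = []
    # Phase 1: dark S-, duration lengthened from 1 sec to the full duration.
    for duration in linspace(1.0, max_duration, steps_per_phase):
        schedule.append((1, "dark", duration, 0.0))
    # Phase 2: duration cut back to 1 sec while the green intensity is raised.
    for intensity in linspace(0.0, 1.0, steps_per_phase):
        key = "dark" if intensity == 0.0 else "green"
        schedule.append((2, key, 1.0, intensity))
    # Phase 3: full-intensity green S-, duration lengthened again.
    # The first step is skipped because it repeats the last step of Phase 2.
    for duration in linspace(1.0, max_duration, steps_per_phase)[1:]:
        schedule.append((3, "green", duration, 1.0))
    return schedule


for step in fading_schedule():
    print(step)
```

Only one property of S- changes within a phase, which is the feature of the procedure the sketch is meant to show.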

Regrettably, fading remains a part of the art rather
than the science of operant conditioning. Many in-
vestigators (e.g., Rilling, Kramer, and Richards, 1973;
Karpicke and Hearst, 1975) have obtained more er-
rors than Terrace reported in attempts to obtain er-
rorless discriminations. The parameters of fading
which are necessary for errorless learning remain un- 
investigated. How rapidly should the intensity and 
duration of S~ be increased in order to obtain opti- 
mal errorless discrimination learning? Furthermore, 
the effectiveness of fading is rarely compared with 
appropriate control conditions in which fading is not 
employed. The procedural variables which are re-
sponsible for errorless discrimination learning have
been neglected in the rush to compare errorless learn- 
ing with learning with errors. 

In a second experiment Terrace (1963b) compared 
three procedures for transferring stimulus control 
from the wavelength dimension to the line-orientation 
dimension. In the first phase of the experiment, the 
pigeons acquired an errorless discrimination between 
red (S+) and green (S—), and in the second phase the 
discrimination was between a vertical line (S+) and a 
horizontal line (S—). Transfer was accomplished by 
several procedures. In the abrupt procedure, the red 
stimulus on the S+ trials was replaced with a vertical 
line and the green stimulus on the S~ trials was re- 
placed with a horizontal line. In the superimposition- 
only procedure, the vertical line was superimposed on 
red and the horizontal line was superimposed on 
green for five sessions before the lines were presented
alone. In the superimposition and fading procedure, 
the vertical and horizontal lines were superimposed 
upon the chromatic stimuli for five sessions, but the 
intensities of the red and green lights were gradually 
reduced within one session until they were no longer 
visible to the human observer. A fourth group re- 
ceived discrimination training between the vertical 
and horizontal lines without pretraining on the dis- 
crimination between red and green. 

The results of this experiment are presented in 
Figure 19, which shows that the total number of 
errors during S~ depended upon the method of 
transfer from the red-green to the vertical-horizontal 
discrimination. The superimposition and fading pro- 
cedure produced errorless transfer, while the abrupt 
procedure produced the most errors. This experiment 
demonstrates that both fading and superimposition 
independently facilitate transfer of stimulus control 
and supports an interpretation of errorless learning 
as a transfer of stimulus control from one dimension 
to another.

Fig. 19. The number of errors made by each bird during the
acquisition of a vertical-horizontal discrimination following
errorless red-green discrimination training. Errors during a 
second series of red-green discrimination sessions are also indi- 
cated. (From Terrace, 1963b. © 1963 by the Society for the 
Experimental Analysis of Behavior, Inc.)


TERRACE’S THEORY OF
ERRORLESS LEARNING 


Errorless learning has been observed occasionally 
for many years, but has remained little more than a 
laboratory curiosity and incidental observation. For 
example, Skinner (1938, pp. 203-206) established a 
brightness discrimination in rats without errors by 
introducing S— during the first session of training. 
Errorless learning was also observed by Schlosberg 
and Solomon (1943). 

The popularization of programmed instruction by 
Skinner in the late fifties and early sixties led to in-
tense interest in errorless learning. Skinner argued
that one of the major advantages of programmed in- 
struction was that the student is seldom wrong. In 
1961, he eloquently advocated training procedures 
which produce as few errors as possible. This was 
accomplished through stimulus generalization by 
changing each successive question only slightly from 
one frame to the next. Although Skinner offered no 
data, errors presumably had aversive consequences 
which might lead to escape behavior. Skinner’s theory 
that optimal learning is accomplished by maximizing 
positive reinforcement and minimizing aversive con-
sequences forms a background for the theory of error-
less learning developed by Terrace.

Most of the research in errorless learning is not
concerned with the transfer of stimulus control. 
Rather, it is concerned with testing Terrace’s theory 
that the phenomena of errorless learning are funda- 
mentally different from the phenomena obtained 
when the discrimination is acquired with errors. 

Terrace has consistently argued that errorless learn- 
ing is fundamentally different from learning with 
errors. In discussing the results of his original experi-
ment, Terrace (1963a) wrote, “Discriminations ac-
quired with zero, or a near zero, number of responses
to S-, can be clearly distinguished from discrimina-
tions acquired with large amounts of responding to
S- by criteria other than the number of acquisition
responses to S-” (p. 23). The theory was subsequently
refined by Terrace (1966b) to specify that S- functions
as a neutral stimulus following errorless discrimina- 
tion learning, while S— functions as an aversive and 
inhibitory stimulus following learning with errors. A 
similar distinction between the two different types of 
discrimination learning is maintained in the most 
recent version of this theory (Terrace, 1972). 

Terrace distinguishes the product of discrimination 
learning from the by-product of discrimination learn-
ing. The product of differential reinforcement is the
acquisition of a higher rate of responding to S+ than
to S-. For convenience, Terrace (1972) has classified
other behavioral phenomena which occur during the 
acquisition of a discrimination as by-products of dis- 
crimination learning. The by-products of discrimina- 
tion learning include behavioral contrast, emotional 
and aggressive behavior during S—, aversive proper- 
ties of S—, inhibitory properties of S—, and the posi- 
tive and negative peak shift. Terrace (1972) set out to 
assess whether each of the by-products had similar 
underlying mechanisms which would be the case if 
they covaried as a function of the same variables. 
Terrace’s theoretical position is simply stated: “None
of [the by-products of discrimination learning] occurs
following discrimination learning without errors” (p.
251). This is reiterated in each of his papers on error-
less learning (Terrace, 1963a, 1963b, 1964, 1966a,
1966b, 1966c, 1968, 1972, 1973). As the review of the
literature in this section will show, each of the by-
products of discrimination learning, except the peak
shift, has been obtained following errorless discrimi-
nation learning. Therefore, a revision of Terrace’s
theory is required.

Rilling and his students (Kodera & Rilling, 1975;
Rilling & Caplan, 1973, 1975; Rilling, Caplan, How-
ard, & Brown, 1975; Rilling, Kramer, & Richards,
1973) have carried out a series of experiments which
demonstrate that the by-products of discrimination
learning bear little relationship to the occurrence or
nonoccurrence of errors during S-. These experi-
ments suggest that the parameters which most read-
ily produce one by-product (e.g., extinction-induced
aggression) may not produce another (e.g., behavioral
contrast). Only the peak shift and inhibitory stimulus
control appear to have identical underlying mech-
anisms. Therefore, the theory which explains behav-
ioral contrast may differ from the theory that explains
extinction-induced aggression, and so on. It follows
that the best strategy for research is to identify those 
variables which produce each behavioral phenomenon 
and to develop separate theories for each. Whether a 
behavioral process is classified as a by-product or not 
is irrelevant for the task of isolating its determinants. 


What Is Learned in Errorless Learning? 


This section demonstrates how errorless learning 
modifies the behavior of the organism. The experi- 
ments were designed to determine if it is necessary to 
modify traditional theories of discrimination learn- 
ing for the errorless case. The procedures employed 
are: (1) observations of aggressive behavior, (2) escape
from S-, and (3) assessment of inhibitory stimulus
control.


AGGRESSIVE BEHAVIOR DURING 
ERRORLESS LEARNING 


Azrin, Hutchinson, and Hake (1966) developed the
standard procedure for measuring extinction-induced
aggression between pigeons in the experimental lab-
oratory. They alternated periods in which each re-
sponse was reinforced with periods of extinction. In
addition to the experimental pigeon, a second, par-
tially restrained pigeon was also present at the rear
of the experimental chamber. Azrin et al. found that
the probability of attack against the target pigeon
was low when the opportunity to eat was available.
However, when the opportunity for reinforcement
was withdrawn, the probability of aggression was
high. The duration of attack was a direct function of
the number of food reinforcements and decreased as
a function of the time since the withdrawal of the
opportunity to eat. With a high degree of consis-
tency, they found that attack occurs at the moment of
transition from reinforcement to extinction, which
led them to suggest that the interruption of eating
produced the aggression.

In the typical procedure for measuring aggres-
sion during extinction, a substantial number of re-
sponses on the key are observed. This raises the ques-
tion of whether the aggression is produced from the
frustration which occurs as a by-product of the non- 
reinforced responses, as Terrace argued (1966c, 1971, 
1972), or by the withdrawal of the opportunity for 
reinforcement. If aggression occurred during an error- 
less S—, then nonreinforced responses are not neces- 
sary for this species-specific behavior. This would 
argue instead that withdrawal of reinforcement is the 
crucial determinant of aggression. In order to obtain 
quantitative data on the phenomenon of extinction- 
induced aggression during errorless learning, Rilling 
and Caplan (1973) trained pigeons to discriminate 
without errors between a green light as S+ and a dark 
key as S~. The opportunity to attack a restrained tar- 
get bird was also present. During discrimination train- 
ing, the rate of attack in the presence of the dark key 
was higher for each animal than the operant level, 
even though most of the animals acquired the dis- 
crimination without errors. Furthermore, the rate of 
attack did not decrease during 45 sessions of discrimi-
nation training. These data demonstrate that attack
during S- also occurs during errorless discrimination
training and fail to confirm Terrace’s theory.

The procedure developed by Azrin et al. only de-
tects the final component of the attack sequence.
Rilling and Caplan (1973) photographed some of the 
species-specific precursors of the attack response, flight 
behaviors, and the actual attack response for two birds 
that were errorless throughout the experiment. For 
pigeons, the first response in the sequence is bowing, 
which frequently precedes attack. In bowing, follow- 
ing erection of the head and body, the bird ruffles the 
feathers of the neck and bows the head toward the 
ground while walking in circles and emitting cooing 
calls. The second response, illustrated in part A of 
Figure 20, is attack intention in which, while stand- 
ing upright, the bird raises the feathers of its neck 
and pecks in an open space in front of its opponent 
while vibrating its wings. The final response is attack 
itself. Part B shows an attack response for bird 5. Part
C shows an attack response by bird 1. Part D shows 
bird 5 immediately after an attack terminated. While 
these preattack behaviors are difficult to measure au-
tomatically, they are reliably obtained during an er-
rorless S- in the presence of a target pigeon.


Fig. 20. Photographs taken during S- of two pigeons that ac-
quired a discrimination without errors. Section A illustrates
attack intention and B illustrates attack for bird 5. Section C 
shows attack for bird 1 and D shows bird 1 shortly after an 
attack response. (From Rilling & Caplan, 1973. © 1973 by the So-
ciety for the Experimental Analysis of Behavior, Inc.)

The data of Azrin et al. (1966) and Rilling and
Caplan (1973) suggest that the withdrawal of the 
opportunity for reinforcement is one of the primary 
determinants of extinction-induced aggression. The 
probability of attack is highest immediately following 
the termination of S+ and decreases thereafter. Fur-
thermore, high rates of attack during extinction oc-
curred only during those sessions in which S+ and S-
alternated, but did not occur in sessions in which S-
was presented alone. Extinction per se did not induce
aggression. This suggests that the less the incentive
contrast between S+ and S-, the less the amount of
aggression obtained. In a subsequent experiment,
Rilling and Caplan (1975) found that the frequency
of reinforcement during S+ was a determinant of
extinction-induced aggression during errorless dis-
crimination learning. A VI 30-sec schedule induced a
higher rate of attack during extinction than a VI 
5-min schedule. 

The results of these experiments demonstrate that 
the aggression-inducing properties of S- are not pri-
marily due to the contingencies of reinforcement pre-
vailing during S-, but are a contrast effect determined
by the contingencies prevailing during S+.


ESCAPE FROM S-


Rilling, Askew, Ahlskog, and Kramer (1969) con- 
ducted a series of experiments which demonstrated 
that an escape paradigm can be used to detect the 
aversive properties of S- in a successive discrimina-
tion. In their procedure, a successive discrimination
was programmed on one key. A peck on a second key
produced a time-out which terminated S- or S+ and
darkened the chamber. During the acquisition of the
discrimination, time-outs occurred during S-. The
probability of a time-out was highest early in S-, a
finding which paralleled the occurrence of attack 
behavior in a similar situation (Azrin, Hutchinson, & 
Hake, 1966; Rilling & Caplan, 1973). These data sup- 
port the assertion that the time-out response is an 
escape response and an index of the aversive proper- 
ties of the stimuli present when the response occurs. 

One of Skinner’s arguments in favor of learning 
without errors is that errors make the situation so 
aversive that the learner tends to escape from the 
environment in which learning is supposed to occur. 
Rilling, Kramer, and Richards (1973) designed an 
experiment to test directly Skinner’s interpretation 
of errorless learning using the escape procedure of 
Rilling et al. (1969). In a successive discrimination 
four groups of pigeons were trained to discriminate 
between red and green. Following the design of Ter-
race’s (1963) original experiment, the groups differed
with respect to the procedure used to introduce S-:
early-progressive, early-constant, late-progressive, and
late-constant. The aversive properties of S- were mea-
sured using the escape procedure of Rilling et al.
(1969) in which a single peck at a second key termi-
nated S- for 10 sec and darkened the chamber.

Figure 21 shows the number of errors and time-
outs for each animal in each of the four conditions.
The data in the upper panel are ordered from left to
right by ranking the animals from least to most er-
rors. The lower panel shows the number of time-outs
for the corresponding bird in the upper panel. The 
procedures for introducing S~ had a significant effect 
on the number of responses to S- during the acquisi-
tion of the discrimination. The constant procedure
produced more errors than the progressive procedure.
The procedures for introducing S- also had a signifi-
cant effect on the number of time-outs from S-. How-
ever, here late introduction of S- produced more
time-outs from S- than early introduction of S-.

Fig. 21. (Upper panel) Total number of responses to S- for
each animal during all sessions of discrimination training.
(Lower panel) Total number of time-outs from S- during the
10 final sessions of discrimination training. The arrows between
panels indicate the mean for each group. Note the lack of
correlation between the number of responses to S- and the
number of time-outs from S-. (Rilling, Kramer, & Richards,
1973.)
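The relation between errors and time-outs within a condition, which the caption of Figure 21 describes as a lack of correlation, could be quantified with a rank correlation. The sketch below computes Spearman's coefficient using only the standard library; the choice of statistic is an assumption made for illustration, and the two lists are invented numbers, not the values plotted in Figure 21.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either list is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical birds within one condition (invented numbers).
errors_to_s_minus = [0, 3, 12, 40, 150]
time_outs = [22, 5, 30, 8, 15]
print(round(spearman(errors_to_s_minus, time_outs), 2))
```

A coefficient near +1 would be expected if time-outs rose with the number of nonreinforced responses; a value near zero would correspond to the pattern the caption describes.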

In interpreting the results of an experiment in 
which pigeons escaped from S-, Terrace (1971) con-
cluded that “the occurrence of nonreinforced respond-
ing to S- is the crucial factor in rendering S- aver-
sive.” Therefore, Terrace predicts that a positive
correlation is obtained between the number of errors 
and an index of the aversiveness of S—. Figure 21 
demonstrates that, within each condition, the number 
of responses to S~ is a poor predictor of the number 
of time-outs from S~. To the extent that time-outs 
from S~ are an index of the aversive properties of 
S—, these data do not support the view that the aver- 
siveness of a stimulus is directly proportional to the 
number of unreinforced responses emitted in its pres- 
ence, 
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Independent tests for inhibition. One of the first 
experiments to suggest that an errorless S— is inhibi- 
tory was an experiment of Marsh and Johnson (1968). 
In the first phase of the experiment, an errorless succes- 
sive discrimination between red and green was estab- 
lished with a fading procedure. In the second phase 
of the experiment, the reinforcement contingencies 
for St and S~ were reversed, so that the previously 
positive stimulus was extinguished and the previously 
negative stimulus was reinforced. Four of the five sub- 
jects did not respond to $+ (the former S—) more 
than once during five days of reversal training. A con- 
trol group that did not receive errorless training 
acquired the reversal rapidly. Therefore, a history of 
errorless learning appears to retard the detection of 
changes in the reinforcement contingencies. As com- 
pared to a subject that has acquired a discrimination 
with errors, the errorless bird is at a relative disad- 
vantage in coping with an environment in which the 
conditions of reinforcement change. 

As mentioned earlier, Hearst, Besley, and Farthing 
(1970) argue that an independent test is necessary to 
demonstrate that a particular stimulus is inhibitory. 
Two procedures have been developed for such tests. 
In the combined-cues test, S- is an inhibitory
stimulus if its combination with S+ produces a decrement
in responding. In the resistance-to-reinforcement test,
S- is an inhibitory stimulus if the acquisition of a
conditioned response to it is retarded. Using both of these
tests, Wessells (1973) demonstrated that an errorless
S- was inhibitory.

Wessells established errorless learning with an au- 
toshaping paradigm. On a CS+ trial, the key was 
illuminated with a green light for 6 sec and always 
followed immediately by food independently of the 
pigeon’s behavior. On a CS- trial, the key was
illuminated with a white vertical line on a black
background and was never followed by food. Half of the
trials were CS+ trials and half were CS- trials. The
key was dark between trials. During Phase 1, pecking
on the key emerged during CS+ for each bird, while
the birds were errorless during CS- by a criterion
of 25 or fewer errors. The tests for inhibition were
carried out during the second phase of the experi- 
ment. 

Four groups were employed in Wessells’s
experiment. Group 1 received 80 presentations each of CS+
and CS- in Phase 1, while Group 2 received 200
presentations each of CS+ and CS-. In Phase 2,
Groups 1 and 2 received a test in which the resistance
to autoshaping of the former CS- was measured. On
half of the trials, the former CS- was presented on
the left key; on the other trials, a novel stimulus
was presented on the right key. For each bird, pecking
at the key with the novel stimulus emerged earlier
than pecking at the key with the former CS-. The
amount of inhibition was a function of the number
of conditioning trials: the birds with 400 differential
conditioning trials showed more inhibition than the
birds receiving only 160 trials.

Group 3 also received 400 differential conditioning 
trials followed by a combined-cues test in which CS-, 
the white line, was superimposed upon CS+, the green 
background. Group 4 was a control group for the 
possible unconditioned suppressive effects of the white 
line. Combining the CS- with the CS+ completely
suppressed responding for the birds that received
errorless differential conditioning. In contrast,
combining the novel white line with CS+ produced only
a slight decrement in responding. The results of
Wessells’s experiment are a convincing demonstration
that an errorless S- acquires inhibitory properties. A
negative contingency between the CS- and the
unconditioned stimulus (food), so that the CS- predicts
the absence of food, appears sufficient to account for 
the development of conditioned inhibition (see 
Rescorla, 1969a). 

Using an errorless autoshaping procedure similar 
to that employed by Wessells (1973), Wilkie and 
Ramer (1974) found that an errorless S- is more
resistant to autoshaping than an S- with errors. Therefore,
as measured by the resistance-to-reinforcement
procedure, an errorless S- is more inhibitory than an S-
with errors. Clearly, an errorless S- is not a neutral
stimulus, as was first proposed by Terrace (1966).


Generalization following errorless intradimensional
training. The evidence that S- remains a neutral
stimulus following errorless learning is based on two
experiments of Terrace (1964, 1966) in which a
stimulus-generalization test was administered
following errorless discrimination learning. In one of the
most widely cited experiments, Terrace (1964)
established an errorless intradimensional discrimination
with an early-progressive procedure. S+ was 540 nm
and S- was 580 nm. During a generalization test
administered after 14 days of discrimination training,
the peak of the gradient for all three
errorless birds occurred at S+. When a late-constant
procedure was employed, a peak shift was obtained
for two of the three subjects. These data led Terrace
to conclude that S- is a neutral stimulus when the
discrimination is acquired without errors. While
Terrace attributed the difference between the groups to
the number of errors during S-, an equally plausible
interpretation is that the probability of obtaining the
peak shift depends upon the procedure for
introducing S-.

The peak shift is the only phenomenon of stimulus 
control which has not yet been obtained following 
errorless discrimination training. In view of the
evidence cited above that S- functions as an inhibitory
stimulus following errorless learning, it seems likely
that the peak shift might occur following
intradimensional errorless learning. The problem is to select
values of S+ and S- which are spaced closely enough
to permit the observation of the peak shift yet which 
do not preclude errorless learning. 

Generalization following errorless interdimensional
training. When a stimulus-generalization test
is administered following interdimensional
discrimination learning, an inhibitory gradient is usually
obtained around S-. Two experiments of Terrace
(1966b, 1972) are apparently exceptions to this
generalization. In the first experiment, a single group of
birds was trained to discriminate between a white
vertical line (S+) and a wavelength of 580 nm (S-)
using traditional discrimination-training procedures.
Substantial individual differences in the number of
responses to S- were obtained. Flat gradients with
virtually zero responses to each test stimulus were
obtained for those subjects that acquired the
discrimination with the lowest rates of responding to S-.
Inhibitory gradients with a minimum at S- were obtained
for those subjects that acquired the discrimination
with the highest rates of responding to S-. These
data led Terrace to conclude that inhibitory stimulus
control was absent following errorless learning.

These experiments have been extensively criticized
by subsequent investigators. Bernheim (1968) pointed
out that the post hoc method of dividing the subjects
into groups with and without errors biased the
errorless group toward pigeons that did not respond to
wavelength and therefore virtually guaranteed a zero
base line. Terrace attempted to meet this criticism in
a second experiment (Terrace, 1972) in which a
fading procedure was used to train the discrimination
without errors. As in the first experiment,
the gradients obtained from the errorless group
were flat. Birds that acquired the discrimination
without the fading procedure responded to S- and
demonstrated inhibitory gradients during the
stimulus-generalization test.

A second criticism that applies equally to both
experiments is that virtually no responses were
emitted to any test stimulus by the errorless birds. Deutsch
(1967) and Hearst, Besley, and Farthing (1970, p. 388)
noted that the assessment of stimulus control by the
S- dimension is ambiguous when stimulus values far
from S- produce zero responding, because a “floor
effect” prevents the detection of a minimum in
responding at S-. Hearst et al. pointed out that the
assessment of stimulus control by the S- dimension
requires an elevation in the overall level of
responding to each of the stimuli during the
stimulus-generalization test. If Terrace’s interpretation is correct,
the gradient should remain flat when elevated.

In an experiment designed to verify the Hearst et
al. interpretation of Terrace’s experiment, Rilling,
Caplan, Howard, and Brown (1975) compared the
resistance-to-reinforcement and combined-cues
techniques for elevating responding to the test stimuli
from the S- dimension. Discrimination training
began with autoshaping to increase the probability of
errorless learning. The positive stimulus was a green
key light that was followed immediately by food after
8 sec, while the negative stimulus was a black vertical
or horizontal line that was never immediately
followed by food. Autoshaping was followed by eight
days of successive discrimination training during
which responding to S+ was reinforced on a VI
schedule. After identical conditions of discrimination
training, three different types of generalization tests were
employed: resistance to extinction with compounding,
resistance to reinforcement without compounding,
and resistance to reinforcement with compounding.
In the compounding tests, various line angles were
presented on a green background. In the tests without
compounding, these same line angles were presented
without the green background. Terrace’s experiments
employed resistance to extinction without
compounding.

Figure 22 presents the average number of responses 
to each of the test stimuli during days 1-5 of
generalization testing for each of the groups. An
inhibitory gradient with a minimum at S- was obtained
for the two groups tested with the
resistance-to-reinforcement procedure. Therefore, stimulus control was
acquired by the S- dimension even though the
discrimination was acquired with a very low rate of
responding to S-. More responses were obtained
during the compound test than during the noncompound
test, because compounding produced responses early
in generalization that were reinforced on the VI 
schedule. A flat gradient above the zero base line was 
obtained for the experimental group that was tested 
during extinction with the compound stimuli, in- 
dicating that inhibitory stimulus control by the angle 
of the line was not obtained for this condition of 
testing. 

These results emphasize that flat gradients with
zero responses to each test stimulus, such as those
obtained by Terrace (1966, 1972), are an equivocal
outcome. Deutsch (1967) and Hearst et al. (1970)
were correct in their initial criticism of Terrace’s
experiment that a “floor effect” prevented the detection
of inhibitory stimulus control. A rate of responding
to the test stimuli greater than zero is necessary for
the assessment of inhibitory stimulus control. In the
Rilling et al. experiment, the resistance-to-reinforcement
procedure was the more sensitive index of
inhibitory stimulus control. Similar results have been
obtained by Karpicke and Hearst (1975).


Behavioral Contrast 


The phenomenon of behavioral contrast receives 
particular emphasis here, as its investigation (Terrace, 
1963a) formed the basis for Terrace’s theory of error- 
less discrimination learning. 

The four training groups—early-progressive, early- 
constant, late-progressive, and late-constant—differed 
considerably in terms of the number of responses 
emitted to S-, with the subjects of the early-progressive
group displaying errorless or nearly errorless
performance. Substantially more errors to S- were
observed for the other three groups. Terrace subse- 
quently attributed special significance to the distinc- 
tion between discrimination learning with and with- 
out errors based on other differences between early- 
progressive birds and the others. The most basic of 
these differences was the absence of behavioral con- 
trast in the early-progressive group. From this observa- 
tion, Terrace (1972) concluded that behavioral con- 
trast does not occur if the discrimination is acquired 
without errors. 

In contrast with Terrace’s interpretation, Reynolds 
(1961), Friedman and Guttman (1965), Taus and 
Hearst (1970), Vieth and Rilling (1972), and Sadowsky 
(1973) obtained behavioral contrast from pigeons 
when a blackout, during which the chamber was 
completely dark, was employed as S-. While the
number of errors during S- was not always recorded in
these experiments, the blackout presumably
functioned as an errorless S-, since pigeons readily
discriminate a blackout from an illuminated key
associated with reinforcement. These data suggest that
nonreinforced responding to S- plays a minor role in
the production of behavioral contrast.

Since the early-progressive group failed to display 
behavioral contrast in Terrace’s experiment, an alter- 
native interpretation, which logically must be enter- 
tained, is that some aspect of this procedure pre- 
cluded the observation of contrast. In Terrace’s 
(1963a) design, it is impossible to partition the effects 
on behavioral contrast of learning the discrimination 
with or without errors from the immediate effects of 
the specific training procedure employed. Conse- 
quently, Kodera and Rilling (1975) systematically 
replicated Terrace’s original experiment in errorless 
learning. However, a dark key was used as S- with a
green key as S+, rather than the red and green
stimuli used by Terrace. This substitution of S- stimuli
produced errorless acquisition of the discrimination 
in many pigeons from groups in addition to those re- 
ceiving early-progressive discrimination training. 

At the completion of the experiment, all eight
birds of the early-progressive group were still errorless,
using a criterion of 25 or fewer total responses to S-
for classifying a bird as errorless. Six of the eight
pigeons in the early-constant group were errorless,
as were three of the late-progressive group and four of the
late-constant group. These data clearly confirm
Terrace’s finding that the early introduction of S- was
very effective in reducing the number of errors during
discrimination training. The progressive groups
differed from the constant groups in errors only during
the first five days of discrimination training, when S-
was faded in for the progressive groups. The
progressive groups made significantly fewer total errors
during the first five days of discrimination training than
did the constant groups. During the first five days of
discrimination training, the early-progressive group
emitted fewer responses to S- than did the
early-constant group.
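
The classification rule used here is simple enough to state as code. The following is a minimal Python sketch, assuming only the criterion given in the text (25 or fewer total responses to S-). The function names and the response totals are hypothetical and are included solely to illustrate the rule; they do not reproduce the data of Kodera and Rilling (1975).

```python
# Criterion from the text: a bird counts as errorless if it made
# 25 or fewer total responses to S-.
ERRORLESS_CRITERION = 25


def is_errorless(total_s_minus_responses, criterion=ERRORLESS_CRITERION):
    """Return True if the total number of responses to S- meets the criterion."""
    return total_s_minus_responses <= criterion


def count_errorless(groups):
    """Map each group name to the number of errorless birds in that group."""
    return {
        name: sum(is_errorless(n) for n in counts)
        for name, counts in groups.items()
    }


# Hypothetical response totals, invented for illustration only.
example_groups = {
    "early-progressive": [0, 2, 5, 11],
    "late-constant": [3, 25, 26, 480],
}
errorless_counts = count_errorless(example_groups)
```

Because the cutoff is arbitrary, changing `criterion` changes which birds are called errorless without changing anything about their behavior.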

Figure 23 shows the mean daily rates of responding 
to S+ throughout the various phases for each of the 
four groups. Of primary importance was the observa- 
tion that behavioral contrast occurred in all four 
groups. The increase in rate was, in general, slightly
greater following a transition from base line to
discrimination than was the subsequent decrease
following the opposite transition. Figure 23 also reveals a
trend for the late groups to exceed the early groups
in the mean amount of behavioral contrast produced.
While time of S- introduction was a more powerful
variable than the method of introducing S-, constant
introduction of S- tended to produce
more contrast than did progressive
introduction.

In his original experiment, Terrace (1963a) ob- 
served a close relationship between the occurrence of 
errors during S- and behavioral contrast during S+
and went on to argue that behavioral contrast was
produced by nonreinforced responding during S-.
This interpretation of behavioral contrast predicts a
positive correlation between the number of errors
during S- and the magnitude of behavioral contrast
during S+.

Figure 24 compares the number of responses
during S- (upper panel) with the magnitude of
behavioral contrast during S+ (lower panel). The left
panels compare the number of errors during the first
phase of discrimination training with the amount of
behavioral contrast exhibited during the first
discrimination-training phase. The right panels relate the
total number of errors made throughout the
experiment with the mean behavioral contrast produced by
each subject. The birds are ordered according to the
number of errors made during the first discrimination
phase. The group means are indicated by horizontal
arrows.

Fig. 24. Relationships between responding during S- (errors)
and the magnitude of behavioral contrast produced. The bars
within each panel are ordered with respect to the total number
of errors produced during S+|S- 1 (the first discrimination
phase). Arrows within each panel mark the group mean for
that measure. The left half of this figure depicts the
relationship between the number of errors made during S+|S- 1 and
the amount of behavioral contrast observed for each pigeon. The
right half of the figure presents the relationship between the
total number of errors occurring during the first two phases of
discrimination training and the mean behavioral contrast
produced overall. Spearman rank-order correlation coefficients for
each group are indicated by the numbers appearing just below
the error data. (From Kodera & Rilling, 1975. © 1975 by the
Society for the Experimental Analysis of Behavior, Inc.)

A direct relationship between errors and
behavioral contrast within groups would be represented by
a rank-ordering of the amount of behavioral contrast
from least to most within each of the four conditions,
reflecting the symmetry of the error data. Clearly, no
such relationship is visible in the data of Figure 24.
Pigeons that produced the fewest errors in each
condition were as likely to show the greatest amount of
contrast as were those that produced the greatest
number of errors. Spearman rank-order correlation
coefficients between the number of errors and the amount
of behavioral contrast are indicated in Figure 24 for
each group. None of these correlations was significant.
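
For readers who wish to see how such a coefficient is computed, the following is a minimal Python sketch of the Spearman rank-order correlation, calculated as the Pearson correlation of the ranks, with tied values given their average rank. The function names and the example values are hypothetical; the numbers are invented for illustration and are not the data plotted in Figure 24.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks.

    Undefined (division by zero) if either variable is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for one group of eight birds (illustration only):
# total errors during S- and behavioral contrast in responses per min.
errors = [0, 3, 12, 25, 140, 600, 950, 2100]
contrast = [18.0, 4.5, 22.1, 9.7, 15.2, 6.3, 20.4, 11.8]
rho = spearman_rho(errors, contrast)
```

A coefficient near +1 within a group would be the outcome predicted by the view that nonreinforced responding produces contrast; a coefficient near zero would not.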

In other aspects the Kodera and Rilling experi- 
ment confirms the results obtained in Terrace’s ex- 
periment. The major discrepancy between the two 
studies is the occurrence of behavioral contrast fol- 
lowing early-progressive training in the Kodera and 
Rilling study. Another important difference is that 
many of the birds in the early-constant, late-progres- 
sive, and late-constant groups were also errorless, yet 
demonstrated behavioral contrast. Which differences 
in procedure are responsible for Terrace’s failure to 
observe behavioral contrast in the early-progressive 
group? His original experiment included a red key 
as S+ and a green key as S-. Behavioral contrast is
widely assumed to be independent of whether the
stimuli used as S+ and S- lie on the same or different
dimensions, since contrast is obtained after both
inter- and intradimensional training. Using a separate group
of pigeons trained on the early-progressive procedure
of the main experiment, Kodera and Rilling
substituted a red S- for the dark key. The mean magnitude
of behavioral contrast was 23.2 responses per min for
the red group as compared with 15.3 responses per
min for the group with a dark key as S-. These
results suggest that the magnitude of behavioral contrast
is greater following intradimensional training than
following interdimensional training. In any event,
Terrace’s use of a red S- did not preclude the
occurrence of contrast.

A more significant difference between the two
studies was Terrace’s use of 3-min components during S+
and S-, while Kodera and Rilling used 1-min
components. In a group of pigeons for which the
duration of S+ was 3 min, which also decreased the
number of alternations between S+ and S- from 25
to 8, the magnitude of behavioral contrast following
early-progressive training decreased to only 7.8
responses per min, and contrast was not obtained from
every bird. The total duration of S+ remained
constant in these experiments. These data suggest that
Terrace’s failure to obtain behavioral contrast was
due in part to the use of stimulus components with a
duration of 3 min. In addition,
Terrace’s data were subject to more random daily
fluctuation than were those of Kodera and Rilling
(1975), who obtained more stable response rates
through the imposition of response stability
requirements before instituting changes in training
conditions. What is not clear is whether the attenuation in
behavioral contrast was due to the increased duration
of S+ or to the decrease in the number of alternations
between S+ and S-.


Errorless Learning in Perspective


Errorless learning is the transfer of stimulus
control from one dimension associated with S- to
another, provided the transfer is obtained with a zero
rate of responding to the new S- from the first
session of transfer. When the pecking response of the
pigeon is reinforced during S+, errorless
discrimination-training procedures are effective in eliminating
the terminal peck response from the sequence of
observing and approaching S-. Therefore, the behavior
of the organism during S- is modified or conditioned
by errorless discrimination training.

Errorless learning corresponds to the low-error tail
of the distribution of total errors for the acquisition
of a discrimination. Since the definition of an error
response is arbitrary, it follows that the definition of
errorless learning is also arbitrary. Extensive research
has demonstrated that the behavior of subjects with 
few errors is not fundamentally different from the be- 
havior of subjects with many errors, except for the 
difference in errors. 

Escape from S-, aggression during S-, conditioned
inhibition, inhibitory stimulus control, and
behavioral contrast have all been obtained independently
of whether the discrimination was acquired with or
without errors. Therefore, a theoretical classification
based on the distinction between learning with
errors and learning without errors is not useful.

By concentrating upon the number of errors, 
investigators have overlooked key determinants of 
the many phenomena produced by differential
reinforcement. The research of Rilling and his students
demonstrates that the manner of presentation of
conditions during discrimination learning, not the
production of errors, determines the so-called
by-products of discrimination learning.

Two procedural variables that have been identi- 
fied are: (1) the time in the subject’s experimental 
history at which S- is introduced (e.g., early or late),
and (2) the rapidity with which S- is introduced (e.g.,
progressively or abruptly). The data indicate that S-
is more aversive, as measured by escape behavior (and
more behavioral contrast is observed during S+),
when S- is introduced late in training. While data
are not yet available, the previous findings suggest
that conditioned inhibition and inhibitory stimulus
control are also greater when S- is introduced late in
training. The time of S- introduction is a more
powerful variable than the method of introducing S-.
Abrupt introduction of S- produces more behavioral
contrast than gradual introduction. This analysis
suggests that conditioned inhibition and inhibitory
stimulus control are also greater when S- is introduced
abruptly.

Frequency of reinforcement during S+ affects
behavior during S-. Aggressive behavior during
extinction is induced by the withdrawal of the stimulus
associated with reinforcement, rather than by the
change in the consequences of the pecking response
during extinction. The higher the frequency of
reinforcement during S+, the higher the rate of attack
during S-. Since conditioned inhibition is a contrast
phenomenon produced by the absence of reinforce- 
ment, this variable may have a similar influence on 
conditioned inhibition and inhibitory stimulus con- 
trol. 

A pigeon that has acquired a discrimination with a
procedure in which S- is introduced gradually and
early in training may show “errorless learning” and
less aggression and more escape and behavioral
contrast than a pigeon that has acquired the same
discrimination but with an S- introduced abruptly and
late in training. However, the organism that has
acquired the discrimination without errors is retarded
in detecting changes in the response-reinforcer
relationship in the presence of S- as compared to an
organism that mastered the discrimination with
errors. When inhibition is measured with a
resistance-to-reinforcement procedure, more conditioned
inhibition is obtained for the birds that acquired the
discrimination without errors than for the birds that
acquired the discrimination with errors. Therefore,
errorless learning is clearly not the best learning for
an organism exposed to a changing environment.


SUMMARY 


The acquisition of stimulus control in a successive 
discrimination occurs as follows. In discrimination 
training, certain stimuli predict occasions when a 
class of responses is reinforced, and other stimuli pre- 
dict occasions when those responses are not reinforced 
or when they are reinforced according to a different 
program. By definition, stimulus control requires 
different rates or patterns of responding in the pres- 
ence of each stimulus. Following the acquisition of a 
discrimination, a test is necessary to determine which 
aspects of the training stimuli have acquired control 
over responding. The stimulus-generalization gradient
is the primary index of stimulus control. Gradients 
are obtained by extinction or steady-state methods in 
which the stimuli are presented singly or simultane- 
ously. 

In intradimensional training, the interaction
between reinforcement at S+ and extinction at S- is
reflected in the stimulus-generalization gradient as the
positive and negative peak shifts. These processes are
separated in interdimensional training, in which an
excitatory gradient is obtained around S+ and an
inhibitory gradient is obtained around S-. Following
interdimensional training, the normal distribution 
curve provides the best fit to empirically obtained 
gradients of excitation and inhibition. For the vari- 
ables that have been investigated to date, the deter- 
minants of the peak shift and the inhibitory gradient 
are identical. 

A functional analysis reveals a wide range of vari- 
ables that determine inhibitory stimulus control. 
Once the reinforced response is acquired, inhibitory 
control by stimuli associated with the absence of rein- 
forcement may develop rapidly in a matter of min- 
utes. Extended training has relatively little influence 
once the necessary and sufficient conditions for in- 
hibitory stimulus control have been established. In- 
hibitory stimulus control requires relatively brief suc- 
cessive alternations of the discriminative stimuli, as 
opposed to massed presentations, and the presence
during the generalization test of the background stim- 
uli present during training. The determinants of the 
peak shift and inhibitory stimulus control are inde- 
pendent of whether responding is positively or nega- 
tively reinforced. 

Extinction is not necessary for the acquisition of 
inhibitory stimulus control. Stimuli associated with 
less favorable schedules of reinforcement demonstrate 
some of the same inhibitory phenomena as do stimuli 
associated with extinction. Inhibitory control is a 
relative rather than an absolute property of a dis- 
criminative stimulus. 

In the many cases in which responding is
reinforced differently in the presence of S1 and S2,
stimulus control is dependent upon the combination of
the discriminative and incentive properties of the
stimuli. When the response rates to S1 and S2 are
identical, the peak shift and inhibitory stimulus
control are not obtained. The greater the index of
discrimination between S1 and S2, the higher the
probability of obtaining a peak shift. Provided a
discrimination between S1 and S2 is established, the
peak shift and inhibitory stimulus control are
obtained even though the frequencies of reinforcement
in S1 and S2 are identical. Inhibitory phenomena are
maximized by large incentive and rate differences
between S1 and S2.

Some theories of the peak shift and inhibitory stim- 
ulus control have focused upon a single primary 
determinant such as nonreinforced responding during
extinction. Such theories are an inadequate
oversimplification, since an experimental analysis demonstrates
multiple determinants of inhibitory phenomena. 
Spence’s theory of discrimination learning provides a 
simple integrated account of the peak shift and in- 
hibitory stimulus control. However, Spence’s theory 
fails to account for the effects of conditioning con- 
text, the relativity of inhibitory stimulus control, the 
effects of transitions between S- and S+, and en- 
hanced responding within a spatially defined dimen- 
sion. 
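
The kind of account at issue can be illustrated with a small numerical sketch. The Python code below assumes, as a deliberately simplified reading of a Spence-type theory, that net response strength is a normal-curve excitatory gradient centered on S+ minus a weighted normal-curve inhibitory gradient centered on S-. The values of S+ (540 nm) and S- (580 nm) are those reported above for Terrace (1964); the standard deviation, the inhibitory weight, and all function names are assumptions chosen only for illustration, not estimates from any experiment.

```python
from math import exp


def gaussian(x, center, sd):
    """Unnormalized normal curve with its maximum (1.0) at `center`."""
    return exp(-((x - center) ** 2) / (2 * sd ** 2))


def net_gradient(x, s_plus=540.0, s_minus=580.0, sd=30.0, inhibition=0.6):
    """Excitation around S+ minus weighted inhibition around S-.

    sd and inhibition are assumed, illustrative parameters.
    """
    return gaussian(x, s_plus, sd) - inhibition * gaussian(x, s_minus, sd)


# Evaluate the net gradient at 1-nm steps across a test range.
wavelengths = range(480, 641)
peak = max(wavelengths, key=net_gradient)    # wavelength of maximum responding
trough = min(wavelengths, key=net_gradient)  # wavelength of minimum responding
```

With these assumed values, the maximum of the net gradient lies on the side of S+ away from S-, and the minimum lies on the side of S- away from S+, which is the qualitative pattern referred to as the positive and negative peak shifts. The sketch includes none of the factors listed above that such a theory fails to handle.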

The time is ripe for the development of new quan- 
titative theories that integrate the extensive data re- 
viewed in this chapter. The extension to operant 
behavior of theories of Pavlovian differential condi- 
tioning may lead to a single integrated formulation 
of stimulus control. Many of the questions raised by 
investigators of human cognition and information 
processing are the same questions asked by investi- 
gators in stimulus control. The extent to which these 
developments are relevant to research on stimulus 
control remains to be determined.


Stimulus Control


INTRODUCTION 


Experimental subjects do not respond, nor do ex- 
perimenters arrange contingencies of reinforcement, 
in a vacuum. In experiments on both classical and 
operant conditioning, the experimenter delivers
reinforcers only in the presence of a specific set of
stimuli. This is most obviously true of classical
experiments, where the availability of reinforcement is
always signaled by the presentation of a specific 
conditional stimulus (CS). But it is equally true of op- 
erant experiments, for operant responses are rein- 
forced only when they occur in a specific situation. At 
the least they are reinforced only in the experimental 
chamber. Often, they are reinforced only during cer- 
tain periods of an experimental session, with these 
being marked by the presentation of explicit discriminative 
stimuli signaling that some class of responses 
will be reinforced according to a particular 
schedule. 

If the experimental situation or the experimenter’s 
discriminative stimulus or CS is changed in one or 
more ways, it is common to observe an apparently 
correlated change in the subject’s behavior. A pigeon 
that receives food on a variable-interval schedule for 
pecking at a key illuminated with green light will 
peck at a lower rate if the color of the light is changed 
to red. A dog salivating upon every presentation of a 
1,000-Hz tone signaling the delivery of food may 
salivate less profusely if a 2,000-Hz tone is presented. 
If a change in a particular stimulus is always followed 
by a change in the probability, amplitude, latency, or 
rate of a particular response, we may say that this 
stimulus exercised some control over that response. The 
term stimulus control has come to be used as a con- 
venient shorthand expression for describing such an 
observed relationship between changes in external 
stimuli and changes in recorded behavior. 

In the hands of some writers, the term stimulus 
control has been characterized as “relatively neutral” 
and as one to be preferred to “traditional concepts of 
generalization and discrimination” (Terrace, 1966, p. 
271). It is, of course, possible to reserve the term stimulus 
control solely to refer to the slope of a generalization 
gradient, but it is neither clear that it is particularly 
profitable to do so nor obvious that it has 
always been used in this purely descriptive sense. In 
the present chapter, at any rate, the term will not be 
used in this strictly neutral way. To say that a par- 
ticular stimulus has acquired control over a subject’s 
behavior, I shall assume, is tantamount to saying that 
this stimulus has been established as a signal for rein- 
forcement, or as a signal that a certain class of re- 
sponses will be reinforced.1 Evidence that a stimulus 
has been successfully established as a signal for rein- 
forcement may be provided in a variety of ways. One 
way is to show that changes in some features of the 
stimulus result in correlated changes in behavior. 
Another way would be to show that the removal of 
the stimulus resulted in the cessation of the subject’s 
responses. Yet other measures of control are possible: 
the rate of subsequent discrimination learning, when 
the original stimulus continues to signal reinforce- 
ment and a second stimulus signals nonreinforcement, 
would be equally acceptable measures of the control 
gained by the stimulus. 

The slope of a generalization gradient, therefore, is 
only one of several potential measures of stimulus 
control. It is not even necessarily the best or most 
sensitive measure. Thus if an experimenter observes a 
flat gradient of generalization when he varies some 
aspect of the training situation, he is not entitled to 
conclude that this aspect had gained no control over 
his subject’s behavior. He may say, if he wishes, that 
this aspect exerted no control over responding on this 
series of test trials, but this, of course, is no more than 
a redescription of the outcome of the data. The flat 
gradient is not necessarily evidence that this stimulus 
failed to acquire control over behavior; it may simply 
imply that the testing procedure is inadequate to 
demonstrate such control. Flat gradients are often, for 
example, consequences of ceiling or floor effects. 
Farthing and Hearst (1970) trained pigeons on a dis- 
crimination between a vertical line displayed on a 
blue background and a horizontal line on a green 
background. When given a series of nonreinforced 
test trials to the component stimuli as well as to 
various compounds, they responded at so low a rate 


1 A stimulus may also gain control if it is established as a 
signal for the omission of a reinforcer or as a signal that certain 
responses will not be reinforced. The phenomena of inhibitory 
control, discussed by Rilling in chapter 15, are not specifically 
discussed here. It seems reasonable, however, to expect that 
many of the principles derivable from studies of excitatory 
stimulus control will apply equally to the case of inhibitory 
control. 
to vertical and horizontal lines presented on black 
backgrounds that it was impossible to detect any 
evidence of significant control by the orientation of 
the line. Other tests, however, revealed a substantial 
difference in the readiness to respond to vertical and 
horizontal lines (see also Zentall, 1972). 

Conversely, a subject with a very high probability 
of responding in the training situation may continue 
to respond on all test trials, and thus show a flat 
gradient of generalization because of a ceiling effect. 
This may happen only rarely in experiments which 
study rate of key pecking in pigeons, for rate of re- 
sponding is a relatively unbounded measure of 
response strength. Where more bounded measures are 
used, such as the proportion of trials on which a re- 
sponse occurs, or measures of conditioned suppression 
in aversive conditioning, there is clear evidence that 
ceiling effects may obscure the control actually gained 
by a stimulus, which is only displayed during the 
course of an extended series of test trials (e.g., Gray & 
Mackintosh, 1973; Hoffman & Fleshler, 1961). 
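
The way in which a ceiling may conceal a sloping gradient can be illustrated with a minimal sketch. It assumes, purely for illustration, a Gaussian gradient of underlying response strength around a 1,000-Hz training stimulus and a bounded measure (proportion of trials with a response) that cannot exceed 1.0. The function names, the shape of the gradient, and every numerical value are hypothetical and are not drawn from any of the studies cited.

```python
import math

def underlying_strength(stimulus, trained=1000.0, peak=3.0, width=400.0):
    """Hypothetical Gaussian gradient of response strength (arbitrary units)."""
    return peak * math.exp(-((stimulus - trained) ** 2) / (2 * width ** 2))

def bounded_measure(strength, ceiling=1.0):
    """Bounded measure, e.g. proportion of trials with a response.

    Any strength above the ceiling is recorded as the ceiling itself.
    """
    return min(strength, ceiling)

# Hypothetical test stimuli (Hz) on either side of the training value.
test_stimuli = [600.0, 800.0, 1000.0, 1200.0, 1400.0]

strengths = [underlying_strength(s) for s in test_stimuli]
observed = [bounded_measure(v) for v in strengths]

for s, v, o in zip(test_stimuli, strengths, observed):
    print(f"{s:6.0f} Hz  strength={v:.2f}  observed={o:.2f}")
```

Under these assumed values the underlying strengths differ across test stimuli, yet every observed score sits at the ceiling, so the recorded gradient is flat although control is present.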

A second reason why the control gained by one 
stimulus may not be revealed in a generalization test 
is that other aspects of the experimental situation may 
have gained even stronger control over responding. If 
these other stimuli remain unchanged during the 
course of testing, they may maintain a constant rate 
of responding and thus “mask” the control actually 
acquired by the stimulus varied during testing. The 
concept of masking is one that will play a central role 
in later sections of this chapter. For the present, it 
will be sufficient to provide a brief example. Newman 
and Baron (1965) trained pigeons to peck a response 
key illuminated with a white vertical line on a green 
background. When given a generalization test to other 
orientations of the line, still shown on a green back- 
ground, the pigeons responded at a relatively constant 
rate to orientations of the line as far as 45° on either 
side of vertical. Several later studies, however, while 
confirming this finding, have shown that under condi- 
tions where a significant level of responding can be 
maintained in the absence of the colored background, 
a reliably sloping gradient of generalization can be 
observed when different orientations of the line are 
shown on a black background (Freeman & Thomas, 
1967; Newman & Benefield, 1968; Thomas, Svinicki, & 
Svinicki, 1970). 

The flat gradient in the first case reflects more 
about the control over responding acquired by the 
colored background than about the lack of control 
acquired by the line. A particular feature of the rein- 
forced stimulus display may indeed acquire control 
over responding, but this control may be masked during 
a subsequent test, because the unchanged presence 
of another set of features insures a uniform rate of 
responding on all test trials. 
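
The logic of masking can be sketched in a few lines. The sketch assumes, as a deliberate simplification and not as a claim about the mechanism, that the rate of responding on a test trial is set by whichever feature present on that trial has the stronger control. The triangular gradient for line orientation, the constant contribution of the colored background, and all numerical values are hypothetical.

```python
def line_control(orientation_deg, trained=0.0, peak=40.0, width=90.0):
    """Hypothetical triangular gradient of control by line orientation."""
    return max(0.0, peak * (1.0 - abs(orientation_deg - trained) / width))

def response_rate(orientation_deg, background_present, background_control=50.0):
    """Assumed rule: the more strongly controlling feature sets the rate."""
    background = background_control if background_present else 0.0
    return max(line_control(orientation_deg), background)

# Hypothetical test orientations, in degrees from vertical.
orientations = [-45, -30, -15, 0, 15, 30, 45]

on_colored_background = [response_rate(o, True) for o in orientations]
on_black_background = [response_rate(o, False) for o in orientations]

print("colored background:", on_colored_background)
print("black background:  ", [round(r, 1) for r in on_black_background])
```

With the background present on every test trial the rate is constant and the orientation gradient is flat. With the background removed, the same assumed control by the line yields a gradient that peaks at the training orientation.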

The behavior of a subject in a generalization test, 
therefore, does not necessarily provide a simple or 
direct measure of the control acquired by the stim- 
ulus during earlier training. This point has a number 
of important implications. It may not always be easy, 
for example, to determine whether differences in the 
slope of a generalization gradient following different 
experimental treatments reflect differences in the con- 
trol acquired as a consequence of those treatments, or 
the effects of those treatments on the host of other 
variables that may affect test performance. If different 
treatments produce substantially different rates of re- 
sponding, differences in test performance may reflect 
ceiling or floor effects influencing one gradient more 
than the other. If different treatments produce sub- 
stantial differences in resistance to extinction, then, 
since it is known that gradients become progressively 
steeper during the course of extinction (Hoffman & 
Fleshler, 1961; Jenkins & Harrison, 1960; Thomas & 
Barker, 1964), their effect on generalization gradients 
may be simply attributed to this factor rather than 
to their effect on stimulus control per se. 

Examples of one or more of these possibilities will 
recur in what follows. It will be important to remem- 
ber that generalization gradients provide but one of 
several methods of measuring stimulus control and 
that the assessment of stimulus control is necessarily 
an indirect affair. We do not observe stimulus control 
in the data of a generalization test. We may infer that 
a stimulus has acquired control over a subject’s be- 
havior by noting a correlation between changes in 
stimuli and changes in responding. But the inference 
is not always easy. 


CONDITIONS AFFECTING THE 
ESTABLISHMENT OF STIMULUS CONTROL 


The study of stimulus control has for a long time 
centered around the question of the sufficient and 
necessary conditions responsible for the observation of 
a sloping gradient of generalization when some 
feature of the training situation is varied. As Terrace 
(1966) noted, one aspect of this issue was the objection 
raised by Lashley and Wade (1946) against what they 
called a “Pavlovian theory of generalization.” Lashley 
and Wade argued that Hull and Spence had followed 
Pavlov in supposing that the reinforcement of a 
response in the presence of a particular stimulus was 
sufficient to establish a center of excitation, and that 
this excitation would spread to other stimuli in proportion 
to their similarity to the training stimulus. 
Sloping gradients were an automatic consequence of 
reinforcement in the presence of a particular stimulus. 
Although this view can be seriously attributed 
neither to Hull nor to Spence, it is fair to acknowl- 
edge that Lashley and Wade’s paper, by stressing the 
point that sloping gradients might not be an auto- 
matic consequence of the delivery of reinforcement in 
the presence of a particular stimulus, did provide 
valuable impetus in initiating research designed to 
uncover the conditions necessary and sufficient for the 
establishment and demonstration of control by any set 
of stimuli. 


Intrinsic Differences in the Salience of Stimuli 


These are important questions. Perhaps even more 
important, however, is the realization that they may 
not admit of any one general answer. There is no 
reason to suppose that a single set of conditions, 
sufficient and necessary for the establishment of stimulus 
control in one case, will hold for all stimuli, responses, 
reinforcers, or subjects. The control exercised 
by a particular stimulus over the behavior of a particular 
subject will obviously depend on that subject’s 
sensory apparatus; pigeons are more likely than rats, 
for example, to be controlled by the wavelength of a 
discriminative stimulus. Equally, the amount of training 
required to establish control by a particular stimulus 
will surely vary from stimulus to stimulus, and 
in a situation where a number of different stimuli are 
equally correlated with reinforcement, some will acquire 
greater control over responding than others. 
These are not surprising observations. No one would 
deny that some stimuli appear to be more effective for 
some subjects than are others. It may even be useful 
to characterize such differences as consequences of 
differences in the “salience” of particular stimuli to 
particular subjects, provided that the circular nature 
of the definition is appreciated. There are, however, 
less obvious, and therefore rather more interesting, 
constraints on the generality of possible answers 
(Shettleworth, 1972). 


Nature of Response and Reinforcer 


[Figure 1 panels. Left: dogs trained with metronome ahead (right leg) and buzzer behind (left leg); ordinate, proportion of right leg responses. Right: dogs trained with metronome ahead (go) and buzzer behind (no go); ordinate, proportion of go responses. Test stimuli in both panels: metronome behind, buzzer ahead.] 

Fig. 1. Results of experiment by Dobrzecka et al. (1966). In the 
left panel are shown the results from dogs trained to respond 
with different legs to qualitatively different auditory stimuli, 
located in different positions. When tested with the stimuli in 
reversed positions, they reverse their responses. The right panel 
shows the results from dogs trained either to raise or not to 
raise their right leg. When tested with the position of metronome 
and buzzer reversed, they continue to respond as before. 

Dobrzecka, Szwejkowska, and Konorski (1966) 
have shown that the features of an auditory stimulus 
which come to control the responses of a dog may depend 
upon the nature of the response required by the 
experimenter. Dogs were placed in a stand and exposed 
to two discriminative stimuli, a metronome in 
front and a buzzer sounded from behind. One group 
was required to raise their right foreleg in response to 
one stimulus and their left foreleg in response to the 
other; a second group was trained on a go-no go dis- 
crimination, being required to raise their right foreleg 
to one stimulus and to refrain from responding in the 
presence of the other. Subjects were then tested by 
reversing the positions of the metronome and buzzer. 
The results of these test trials are shown in Figure 1. 
It can be seen that if animals were required to learn 
which foot to raise, the location of the signal had ac- 
quired control over responding, while animals re- 
quired to learn the go-no go discrimination had 
learned to respond to the metronome and not respond 
to the buzzer, regardless of the position from which 
they were sounded. 

Thus although the dogs were perfectly well able to 
discriminate both between the locations of the buzzer 
and of the metronome, and between the quality of the 
sound produced by each source, responding was con- 
trolled in one case only by the qualitative difference, 
and in the other case only by the difference in loca- 
tion. The selective control observed cannot be attrib- 
uted to differences in the salience of the two cues, if 
this is understood to refer to the physical character- 
istics of a stimulus and to the subject’s sensory capac- 
ities. 

The work of Garcia, Revusky, Rozin, and others 
has suggested that when a particular reinforcing event 
follows the ingestion of food, the feature of the food 
which will be established as a signal for reinforce- 
ment will depend on the nature of the reinforcer 
(Garcia & Ervin, 1968; Revusky & Garcia, 1970; Rozin 
& Kalat, 1971). Rats appear to associate the flavor of 
food or water with subsequent poisoning, and the 
visual or other external features accompanying its in- 
gestion with a reinforcer such as electric shock. In a 
study that is, in effect, rather similar in design to that 
of Dobrzecka et al., Garcia and Koelling (1966) gave 
rats the opportunity to drink water having a partic- 
ular flavor and whose ingestion was accompanied by 
a particular set of visual and auditory stimuli. One 
group received an electric shock, either immediately 
or after a delay, contingent on drinking this water; a 
second group was made sick either by an injection of 
lithium chloride or by X-irradiation. The results of a 
series of test trials, in which the flavor and the visual 
and auditory cues were separately presented, are 
shown in Figure 2. The shocked animals showed a 
marked reduction in their consumption of water ac- 
companied by these visual and auditory stimuli, but 
no aversion to the specific flavor used in training; 
poisoned rats, on the other hand, showed an aversion 
to the flavor of the water they had been exposed to, 
but none to the visual and auditory stimuli accom- 
panying its ingestion. 

Fig. 2. Rate of drinking during test trials when rats have been 
punished for drinking water with a particular flavor and whose 
ingestion was accompanied by a particular set of audiovisual 
stimuli. The flavor of the water was established as the effective 
signal when induced sickness was the aversive reinforcer, but 
the audiovisual stimuli became the effective signal if electric 
shock was the aversive reinforcer. (After Garcia & Koelling, 1966.) 

The specific features of food or drink that are 
associated with subsequent poisoning may differ from 
one group of animals to another. Predatory birds, for 
example, associate the visual characteristics of their 
prey with its unpalatable taste (Brower, 1969); this is, 
of course, a feature of their behavior responsible for 
the evolution of visual Batesian mimicry among prey 
species. Other birds have been shown to associate the 
visual characteristics of water, rather than its taste, 
with subsequent poisoning (Wilcoxon, Dragoin, & 
Kral, 1971). How far these differences between differ- 
ent species may be attributed merely to differences in 
sensory capacity is not at present certain. The important 
point is that in Garcia and Koelling’s experiment, 
as in that of Dobrzecka et al., one component of 
a compound stimulus acquired control over behavior 
under one condition, while the other component ac- 
quired control under a second condition. 

Garcia’s findings reflect an apparent dependency be- 
tween specific stimuli and specific reinforcers, but they 
should not necessarily be regarded simply as a conse- 
quence of the specialized nature of the system regu- 
lating food intake. Foree and LoLordo (1973), for 
example, found that the elements of a compound 
stimulus that gained control over responding in 
pigeons depended on whether responding was being 
reinforced by the presentation of food or by the avoidance 
of electric shock. On a series of discrete trials, 
signaled by the combined illumination of a red light 
and the presentation of a 440-Hz tone, pigeons were 
trained to press a treadle either to obtain food or to 
avoid shock. When tested with these components in 
isolation, subjects reinforced with food tended to re- 
spond only in the presence of the light, while those 
reinforced by the avoidance of shock responded more 
to the auditory than to the visual component. At the 
very least, these results suggest that the well-known 
difficulty of establishing control over food-reinforced 
key pecking in pigeons by an auditory stimulus (see 
below) cannot be entirely attributed to defects in the 
birds’ sensory system. 

Garcia and Ervin (1968) and Rozin and Kalat 
(1971) have argued that the rat’s readiness to associate 
flavors with illness is a prime example of an adaptive 
specialization in learning. One can hardly doubt that 
such learning is of adaptive significance in the life of 
a relatively omnivorous animal, but to point to the 
adaptive significance of a particular characteristic is 
not the same as specifying the causal factors respon- 
sible for its appearance in any particular individual. 
There may be genetically determined constraints on 
the probability that different stimuli will become sig- 
nals for different reinforcers. It is also possible, how- 
ever, that a subject’s prior experience of correlations 
between events in its environment may affect the 
probability of certain events being established as 
signals for others. It is possible, for example, that an 
adult rat has had a lifetime’s experience in which 
gastric changes have been correlated with changes in 
recently experienced tastes, but uncorrelated with 
changes in auditory or visual stimuli, or that an adult 
pigeon has an extensive experience of correlations be- 
tween changes in visual stimuli and changes in the 
probability of food, but no correlation between audi- 
tory changes and the availability of food. Before 
accepting Garcia and Rozin’s inference, therefore, it 
is important to see how far prior experience can affect 
the establishment of stimulus control. 


Prior Experience 


Although numerous experimenters have exposed 
animals to a variety of different experiences with par- 
ticular sets of stimuli, before training them to respond 
to those, or similar, stimuli, they have only rarely been 
concerned to provide answers to the questions raised 
by the results discussed in the preceding section. It 
will be necessary to survey rapidly some of the other 
questions posed, and answers provided, before returning 
to this issue. 


EARLY EXPERIENCE: LASHLEY AND 
WADE’S HYPOTHESIS 


One of Lashley and Wade’s theses was that a 
change in some aspect of the training situation would 
produce a correlated change in the subject’s behavior 
only if that subject had some prior experience of 
variation along this stimulus dimension. “The ‘dimensions’ 
of a stimulus,” they wrote, “are determined by 
comparison of two or more stimuli and do not exist 
for the organism until established by differential 
training” (Lashley & Wade, 1946, p. 74). The implication 
that has been investigated in a number of studies 
is that animals deprived of all experience of variations 
along a particular stimulus dimension by restricted 
conditions of rearing will produce flat gradients of 
generalization along that dimension. 

Some results reported by Peterson (1962) appeared 
to provide some initial support for this suggestion. 
Peterson trained two groups of young ducklings to 
peck a key illuminated with sodium light of 589 nm 
and then tested them for generalization to other wavelengths. 
Two birds reared in normal illumination 
showed orderly and sloping wavelength gradients with 
a peak at 589 nm. Four other birds, however, had 
been reared in individual cages diffusely illuminated 
by sodium light, and had thus received little or no 
experience of variations in wavelength before the 
generalization test. All four birds showed flat wave- 
length gradients. Peterson’s results have been thought 
to imply that “a necessary condition for obtaining a 
generalization gradient of wavelength whose slope is 
greater than zero is prior exposure to white light. 
White light presumably allowed differential reinforce- 
ment with respect to different wavelengths to occur 
prior to the generalization test” (Terrace, 1966, p. 
279). 

As I shall argue below, it is questionable whether 
this conclusion necessarily follows from Peterson’s 
results. For the moment, however, it will be enough to 
question the reliability and generality of those results. 
Tracy (1970) has attempted to replicate Peterson’s 
experiment with a larger number of subjects, but with 
at best only partial success. He did, indeed, find that 
birds reared in monochromatic sodium light showed 
somewhat flatter gradients than those reared in 
normal illumination, but the main results of his 
experiment, shown in Figure 3, leave no doubt that 
their responding was moderately well controlled by 
changes in wavelength. Tracy further showed that the 
difference between the gradients of these two groups 
may have been partly a consequence of the effect of 
rearing on preferences for different wavelengths. The 
right-hand panel of Figure 3 shows wavelength gra- 
dients of control and experimental subjects following 
reinforcement in the presence of a white vertical line 
on a black key. It can be seen that even without any 
prior reinforcement in the presence of 589 nm, the 
sodium-reared birds showed a greater preference for 
shorter wavelengths than did the controls. It is in this 
region of the spectrum, as the left-hand panel of 
Figure 3 shows, that their gradients were flatter than 
those of controls following wavelength training. 
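
A relative gradient of the kind plotted in such comparisons expresses responding to each test stimulus as a percentage of total responding, which allows gradients from groups with different overall rates to be compared. A minimal sketch follows. The response counts are invented solely to show the arithmetic and are not Tracy’s data; the helper names and the particular slope index (mean decline in percentage points per nm away from the training wavelength) are likewise assumptions made for illustration.

```python
def relative_gradient(responses):
    """Convert response counts per stimulus into percentages of the total."""
    total = sum(responses.values())
    if total == 0:
        raise ValueError("no responses recorded; relative gradient undefined")
    return {stimulus: 100.0 * count / total
            for stimulus, count in responses.items()}

def mean_slope(gradient, trained):
    """Hypothetical slope index: mean drop (percentage points per nm)
    from the training stimulus to each of the other test stimuli."""
    drops = [(gradient[trained] - pct) / abs(stimulus - trained)
             for stimulus, pct in gradient.items() if stimulus != trained]
    return sum(drops) / len(drops)

# Invented response counts at five test wavelengths (nm); 589 nm is the
# training value. These numbers are illustrative only.
sloping_counts = {549: 10, 569: 25, 589: 50, 609: 10, 629: 5}
flat_counts = {549: 20, 569: 20, 589: 20, 609: 20, 629: 20}

sloping = relative_gradient(sloping_counts)
flat = relative_gradient(flat_counts)

print("sloping slope index:", mean_slope(sloping, 589))
print("flat slope index:   ", mean_slope(flat, 589))
```

An index of zero corresponds to a perfectly flat relative gradient. As the surrounding discussion makes clear, a low value on any such index cannot by itself show that the stimulus failed to acquire control.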

Fig. 3. Relative gradients of wavelength generalization in ducklings 
reared either in white light or in monochromatic sodium 
light (589 nm). The left panel shows gradients obtained after 
reinforcement for responding to 589 nm. The right panel shows 
gradients after reinforcement for responding to a vertical white 
line. (After Tracy, 1970. © 1970 by the Society for the Experimental 
Analysis of Behavior, Inc.) 

Ganz and Riesen (1962), in a study with monkeys 
published at about the same time as Peterson’s experiment, 
presented data suggesting that monkeys reared 
in complete darkness might initially generalize a 
trained response relatively completely to wavelengths 
other than that to which they were trained to respond. 
Dark-reared and control subjects were trained to press 
a response key for sucrose solution while exposed to 
monochromatic light projected to their right eye. 
When tested with other wavelengths, interspersed 
with reinforced retraining trials with the original 
stimulus, control subjects showed an orderly, sloping 
gradient from the first day of testing, while those 
reared in the dark produced an initially flat gradient, 
which gradually became steeper during the course of 
successive test sessions. The results of the first test 
session, therefore, suggest that dark rearing may 
initially flatten generalization gradients in monkeys. 

Several later experiments, however, employing 
other subjects, such as chicks or Japanese quail, have 
completely failed to substantiate Ganz and Riesen’s 
or Peterson’s results. Neither rearing in monochro- 
matic light nor rearing in total darkness has been 
found to have any significant effect on the slope of 
wavelength gradients in these subjects (Malott, 1968; 
Rudolph & Honig, 1972; Rudolph, Honig, & Gerry, 
1969). 

The conclusion must be, then, that prior experience 
of variation along a particular stimulus dimension 
is not a necessary condition for the establishment 
of control by a stimulus falling on that dimension. 
Extended rearing under restricted conditions, of 
course, might interfere with the normal development 
of the neural mechanisms underlying stimulus anal- 
ysis, and it is possible that such an effect might be 
more pronounced in primates than in birds. But as a 
general rule, it is clear that artificial restrictions on a 
subject’s prior experience do not necessarily disrupt 
the normal development of simple perceptual anal- 
ysis, and do not prevent the normal establishment of 
stimulus control under appropriate training con- 
ditions. 


NONREINFORCED EXPOSURE TO A SINGLE 
STIMULUS: LATENT INHIBITION 


In Tracy’s (1970) experiment with ducklings, there 
was some suggestion that monochromatic rearing 
might have had some tendency to flatten the gradient 
of wavelength generalization. Although this appears 
to have been largely due to a change in the uncondi- 
tional preference for different wavelengths, there is 
another factor that might have been responsible for 
such an effect. Numerous studies have now established 
that repeated exposure to a particular stimulus in the 
absence of reinforcement may significantly impair the 
subsequent establishment of control by that stimulus 
when it is presented either as a CS or as a discrimina- 
tive stimulus. 

This finding was first clearly reported by Lubow 
and Moore (1959), who termed the effect “latent in- 
hibition.” In their experiment, goats and sheep re- 
ceived 10 nonreinforced presentations of a stimulus 
that later served as a CS signaling shock to the foreleg. 
These animals acquired a conditioned flexion re- 
sponse significantly more slowly than control subjects 
who received no preexposure to the CS. Lubow and 
Moore suggested that this interference with the acquisition 
of a conditioned flexion response might have 
been a consequence of the establishment of some incompatible 
response during preexposure. A subsequent 
experiment provided little support for this suggestion 
(Lubow, 1965), and it has been convincingly 
disproved by the results of several later experiments 
(Halgren, 1974; Reiss & Wagner, 1972; Rescorla, 
1971). In all of these studies, nonreinforced preexposure 
to a stimulus interfered with both excitatory 
and inhibitory conditioning to that stimulus. In Rescorla’s 
experiment, for example, rats were given nonreinforced 
preexposure to a tone before the start of 
conditioned emotional response (CER) conditioning. 
When the tone signaled shock, preexposed subjects 
showed poorer conditioning than controls; but if a 
light was used to signal shock, and a tone-light com- 
pound to signal the omission of shock, preexposed 
animals not only learned to suppress to the light as 
rapidly as controls, they also continued to suppress to 
the tone-light compound longer than the controls. 

The implication is that nonreinforced exposure to 
a stimulus may interfere with the establishment of 
that stimulus as a signal either for reinforcement or 
for the omission of reinforcement. If the effect of such 
exposure were simply to condition a response incompatible 
with the required conditioned response (CR), 
it should obviously facilitate, rather than interfere 
with, the development of conditioned inhibition. 
Latent inhibition is, moreover, a quite general phe- 
nomenon, having been observed in a variety of class- 
ical conditioning preparations (Lubow, 1973), as well 
as in studies of operant discrimination learning 
(Hearst, 1972; Mellgren & Ost, 1969). It is certain, 
therefore, that prior experience with a particular 
stimulus will significantly affect the establishment of 
control by that stimulus—although the effect is not 
that which Lashley and Wade would have predicted. 
Exposure to a particular set of stimuli, so far from 
being a necessary condition for the establishment of 
control by those stimuli, may interfere with the 
establishment of control. Monochromatic rearing, as 
in Peterson’s (1962) and Tracy’s (1970) experiments, 
may reduce the slope of gradients of wavelength gen- 
eralization, not because it abolishes the experience of 
variations in wavelength, but because it insures that 
the wavelength used in training is not readily estab- 
lished as a signal for reinforcement. 


EXPOSURE TO VARIOUS CORRELATIONS 
BETWEEN STIMULI AND 
REINFORCEMENT 


In experiments stimulated by Lashley and Wade’s 
assumptions about the importance of early experience, 
as in studies of latent inhibition, exposure to a particular 
stimulus is scheduled without any correlated 
exposure to reinforcement. There is, however, another 
group of studies which has systematically analyzed the 
effects of exposure to different correlations between 
stimuli and reinforcers on the subsequent acquisition 
of control by those stimuli. Among the earliest and 
best known of such experiments are those of Lawrence 
(1949, 1950) on the acquired distinctiveness of cues, in 
which transfer between simultaneous and successive 
discriminations in the rat was shown to depend on the 
relationship between the relevant stimuli of the two 
problems. Later studies of intradimensional and 
extradimensional shifts, in which, having learned one 
problem, animals are shifted to a second, where the 
relevant stimuli are either from the same dimension 
as, or from a different dimension from, those relevant 
in the first problem, have confirmed that experience 
of a correlation between a particular set of stimuli 
and reinforcement will selectively increase the probability 
that similar stimuli will subsequently gain control 
over responding in a new situation (Shepp & 
Eimas, 1964; Shepp & Schrier, 1969). 

An experiment by Thomas, Mariner, and Sherry
(1969) suggests that the principle of acquired distinc-
tiveness may be used to counteract the difficulty of
establishing auditory control over food-reinforced key
pecking in pigeons. They confirmed the finding, first 
reported by Jenkins and Harrison (1960), that non- 
differential reinforcement of a pigeon’s key pecks in 
the presence of a 1,000-Hz tone would result in essen- 
tially flat gradients of generalization when the fre- 
quency of the tone was varied between 300 and 3,500 
Hz. For 100 days before the start of key-peck training, 
however, a second group of pigeons received their 
daily ration of food in their home cages always 
signaled by a 1,000-Hz tone. All birds in this group 
showed a steep and orderly gradient of generalization 
along the auditory frequency dimension, with a peak 
of responses at 1,000 Hz. It would be of considerable
interest to see whether similar results could be ob- 
tained if the auditory stimuli established as signals 
for food during preexperimental treatment were not 
exactly the same as the tone presented during key- 
peck training. This would suggest that the results of 
Thomas et al. represented a general change in the 
probability of auditory stimuli gaining control over 
food-reinforced behavior. 

Foree and LoLordo (1973), it will be recalled, found 
that although a visual stimulus was more likely than 
an auditory stimulus to gain control over food-rein- 
forced responding in pigeons, this ordering was re- 
versed when subjects were required to make the same 
response in order to avoid shock. Is it possible that 
this differential sensitivity to visual stimuli as signals 
for food, and to auditory stimuli as signals for shock, 
is related to the normal early experience of the 
pigeon via a process similar to that observed in experi- 
ments on acquired distinctiveness? For this to be true, 
it would be necessary to assume that experience of a 
correlation between a particular class of stimuli and 
a particular class of reinforcer would selectively alter 
the distinctiveness of those stimuli as signals for those 
reinforcers. It is possible that in the pigeon’s normal 
experience the availability of food is more reliably 
signaled by visual than by auditory stimuli (as noted 
by Jenkins & Harrison, 1960), but if this is to help 
explain Foree and LoLordo’s results, it must be as- 
sumed that this enhances the distinctiveness of visual 
stimuli as signals for food, but not as signals for shock.

There is, in fact, some evidence of precisely such a 
reinforcer-specific change in the distinctiveness of 
particular stimuli. Mackintosh (1973) found that if 
rats were exposed to uncorrelated presentations of a 
tone and shock, subsequent conditioning between 
tone and shock was severely retarded, although the 
tone could be rapidly established as a signal for water. 
Conversely, exposure to uncorrelated presentations of 
tone and water retarded subsequent tone-water con- 
ditioning, without having a comparable effect on 
tone-shock conditioning. Thus a stimulus that has in 
the past signaled no change in the probability of one 
reinforcer will be established as a signal for that rein- 
forcer only with difficulty, but may readily serve as a 
signal for another reinforcer. 


CONCLUSIONS 


There is, then, evidence that prior exposure to a 
particular correlation between a stimulus and a rein- 
forcer may affect the control over responding acquired 
by that stimulus during subsequent experimental
training. Exposure to a positive correlation between a
stimulus and reinforcer may increase the control 
gained by that stimulus; unreinforced presentations 
of a stimulus or exposure to uncorrelated presenta- 
tions of a stimulus and reinforcer may decrease the 
control gained by that stimulus when subsequently 
paired with reinforcement. There is, however, little 
reason to accept Lashley and Wade’s contention that 
prior exposure to a set of stimuli, in and of itself, 
without regard to the relationship between those 
stimuli and reinforcement experienced during such 
treatment, is a particularly important determinant of 
generalization gradients. There is certainly no evi- 
dence to support the view that prior exposure to 
variations along a stimulus dimension is a necessary 
prerequisite for the establishment of control by that 
dimension. While the ability of a stimulus to acquire 
control over a subject’s behavior depends on that 
subject’s past experience, therefore, there is no reason 
to suppose that the perception of stimulus relations is 
always dependent on exposure to variations in that 
stimulus. For at least some subjects and some stimulus 
dimensions, the perceptual system is already organized 
to respond differentially and in an orderly manner to 
variations along that dimension. This is not to say 
that the dimensions of stimuli to which animals re- 
spond correspond exactly to the physical dimensions, 
such as wavelength, visual intensity, or auditory fre- 
quency, which are manipulated by experimenters. It is 
obvious that we know very little about the dimensions 
along which animals are capable of classifying their 
environment.


EXPERIMENTAL PROCEDURES: 
NONDIFFERENTIAL REINFORCEMENT 
AND DISCRIMINATION TRAINING 


Much of the experimental analysis of stimulus 
control in operant experiments has consisted of at- 
tempts to specify the training procedures required to 
establish control by particular stimuli. The volume 
of research conducted is testimony to the conclusion 
that no single set of conditions appears sufficient and 
necessary for all stimuli, subjects, or experimental 
situations. Experimental conditions apparently suffi-
cient to establish control by visual stimuli over a
pigeon’s food-reinforced behavior, as we have seen, 
are not sufficient to establish auditory control over 
this behavior. Many investigators have ignored the 
possible contribution of differences in salience or past 
experience, and have attempted to show that these 
differences in outcome are more apparent than real.
If, it is argued, visual and auditory stimuli gain con-
trol at different rates, this is because the effective
schedules of reinforcement associated with such 
stimuli are not the same. One suggestion, as we shall 
see below, is that a localized visual stimulus will gain 
control where a diffuse auditory stimulus will not, be- 
cause the probability of the subject’s being stimulated 
by the visual stimulus will be correlated with the 
occurrence of responding and therefore with the prob- 
ability of reinforcement, while the auditory stimulus 
will impinge on the subject whether or not he is 
responding. 


Nondifferential Reinforcement


Jenkins and Harrison’s (1960) experiment estab-
lished beyond question that the nondifferential rein-
forcement of a subject’s responses in the presence of a 
particular stimulus was not always sufficient to insure 
that changes in that stimulus would result in any 
change in the subject’s behavior. This observation has 
been taken by some as their point of departure. Ter-
race (1966), for example, has argued that nondiffer-
ential reinforcement is never sufficient to establish
stimulus control and that apparent exceptions to this
rule are always cases where, inadvertently or implicitly, 
differential reinforcement was in fact scheduled. Al- 
though this position turns out to be rather difficult to 
discredit, I shall argue that it is in fact wrong and 
that even if we ignore differences in past experience, 
the most important cause of differences in stimulus 
control is not any difference in the opportunity for 
differential reinforcement, but a difference in the ex-
tent to which such control is masked by the presence
of other stimuli. 


THE HYPOTHESIS OF IMPLICIT
DIFFERENTIAL REINFORCEMENT


One might have thought that there would be nu- 
merous examples of sloping gradients of generalization 
obtained without the necessity of programming differ- 
ential reinforcement by discrimination training. Pav- 
lov (1927) reported several differences in responding 
to training and test stimuli in experiments on salivary 
conditioning in dogs and also observed systematic 
changes in rate of salivation to test stimuli progres- 
sively less similar to the training stimulus. Subse- 
quent experiments have reported reliably sloping 
gradients along such dimensions as auditory frequency 
after classical conditioning in pigeons (Hoffman & 
Fleshler, 1961) and rabbits (Moore, 1972). A classical 
conditioning experiment, however, necessarily in-
volves differential reinforcement between the presence
and absence of the CS. The subject is, in effect, 
trained on a discrimination between the experimental 
situation alone, signaling nonreinforcement, and the 
experimental situation plus CS, signaling reinforce-
ment. Thus differential reinforcement correlated with
the presence and absence of the CS may be responsible 
for the sloping gradient observed when some aspect 
of the CS is varied. 

Experiments on instrumental learning do not, it 
may be thought, necessarily involve any such differ- 
ential reinforcement correlated with the presence and 
absence of a discriminative stimulus. The subject may 
be placed in the apparatus and responding may be 
reinforced in the continuous presence of some specific 
stimulus, which may then be varied in order to test 
for generalization. Certainly, the classic study of Gutt-
man and Kalish (1956) on wavelength generalization
in pigeons at first sight seems to approximate to this
description. Birds were reinforced for pecking a key
illuminated with a light of a single wavelength and
were then tested with a series of new wavelengths. It
is, however, not difficult to point to at least two
possible sources of differential reinforcement implicit
in Guttman and Kalish’s procedure (Heinemann &
Rudolph, 1963; Terrace, 1966). First, they pro-
grammed brief, 10-sec intertrial intervals during
which the key was dark and the schedule of reinforce-
ment not in effect. Whether or not the birds re-
sponded during these blackouts, it remains true that
the illumination of the key with light of a given wave-
length, during which food was available, was con-
trasted with the absence of illumination, when food
was not available. Guttman and Kalish’s use of a
stimulus localized on the pigeon’s response key may
have permitted a second source of implicit differential
reinforcement. Since reinforcement was contingent on
pecking the key, it follows that at the moment of rein-
forcement subjects must always have just pecked the
key, and therefore been exposed to the wavelength
projected onto the key. At times when they were not
pecking, and therefore not exposed (or not so closely
exposed) to this wavelength, reinforcement was never
delivered. Implicitly, therefore, reinforcement may 
have been correlated with variations in the subjects’ 
exposure to wavelength. 

The first of these suggestions can definitely be ruled 
out. Although the use of a blackout between stimulus 
presentations may have some effect on the slope of 
generalization gradients, it is not a necessary condi- 
tion for the establishment of reliable stimulus control. 
Thomas, Svinicki, and Svinicki (1970) and Thomas, 
Ernst, and Andry (1971), for example, observed rela-
tively steep gradients along the dimension of line
orientation after pigeons had been reinforced for 
pecking a key continually illuminated with a vertical 
line. 

The second suggestion seems intuitively plausible,
but although there is some evidence that the localiza- 
tion of a discriminative stimulus on the subjects’ 
manipulandum may sharpen generalization gradients, 
such results are open to alternative explanations, and 
there is ample evidence that such localization is not 
necessary. Heinemann and Rudolph studied bright- 
ness generalization in pigeons after training them to 
peck a key of a particular brightness. Under ordinary 
conditions, they observed relatively steep gradients;
but when the entire front wall of the pigeons’ cham- 
ber was made equal in brightness to the response key, 
the brightness gradient was essentially flat. Heinemann 
and Rudolph attributed this outcome to a reduction 
in the opportunity for differential reinforcement, 
since, they argued, during training subjects would 
have been exposed to a stimulus of the same bright- 
ness as the response key even when not pecking. It is 
equally possible, however, that it is a consequence of 
the relative indiscriminability of large areas of bright- 
ness; in the absence of any contrast, changes in bright- 
ness of the entire front wall of the chamber during 
testing may have been difficult for subjects to detect. 

It is, at any rate, quite certainly possible to observe 
sloping gradients correlated with changes in relatively 
diffuse stimuli not localized on any response key. 
Hearst (1962) trained monkeys to press a lever for 
food in the presence of a continuous overhead light. 
Subsequent variations in the intensity of the light re- 
sulted in reliably sloping gradients. Mrs. V. Rege, 
working in my laboratory, has trained pigeons to peck 
an unilluminated key in the presence of an overhead, 
continuously illuminated red or blue light. Subse- 
quent generalization tests in extinction revealed reli- 
ably sloping gradients when the color of the overhead 
light was changed. Finally, Rudolph and Van Houten, 
in an unpublished study, have shown that it is possi- 
ble to obtain reliable control by a diffuse auditory 
stimulus in pigeons without explicit discrimination 
training. They trained pigeons to peck a key illu- 
minated with white light and, once pecking was estab- 
lished, gradually faded out the illumination of the 
key until the pigeons were pecking in the dark. Under 
these circumstances, a 1,000-Hz tone, continuously 
present, could be shown to exert significant control 
over responding: they observed reliably sloping gra- 
dients of generalization when the frequency of the tone 
was varied over a series of test trials. This study has a 
number of important implications which will be dis-
cussed later. For the present, it provides another in-
stance of good control by a stimulus not located on
the subject’s manipulandum. 

The burden of these experiments seems quite clear. 
The localization of a stimulus on or near the subjects’ 
manipulandum is not a necessary prerequisite for that 
stimulus to gain control over responding in the ab- 
sence of explicit differential reinforcement. The 
theorist who wishes to maintain that implicit differ- 
ential reinforcement is necessary for the establishment 
of stimulus control must therefore fall back onto a 
new line of argument. This may not, of course, be 
impossible. One might still argue that even when the 
discriminative stimulus is apparently quite diffuse, 
subjects are still implicitly exposed to differential 
reinforcement for responding in its presence. If a 
pigeon is required to peck a key in the presence of an 
overhead light or a tone, for example, the precise 
stimuli to which it is exposed will change while it is 
executing a response. Thus it could be argued that 
the stimuli impinging on the subject at the moment 
of pecking will differ, in some subtle ways, from those 
to which it is exposed when not responding: the 
presence of standing waves might cause a discrim- 
inable change in the intensity of an auditory stimulus 
at the moment of responding. 

One may wonder whether this possibility is sus- 
ceptible of disproof. Moreover, the claim that the 
establishment of stimulus control requires differen- 
tial reinforcement within the experimental situation 
must not only resort to a considerable amount of spe- 
cial pleading; if pressed too far, it also seems headed 
toward some logical inconsistency. The implication is 
that in the absence of such differential reinforcement, 
no stimuli from the experimental situation would 
ever gain control over responding. And yet the mere 
fact that subjects are reinforced in the experimental 
situation means that differential reinforcement is 
programmed between that situation and their home 
cage. Thus if no feature of the experimental situation
could be shown to have acquired control over re- 
sponding, this would show that differential reinforce- 
ment was certainly not sufficient to establish control. 
It may, of course, be impossible to predict which par- 
ticular feature or features will acquire control, and 
there is obviously no guarantee that the controlling 
stimuli will include those which the experimenter 
chooses to vary in a subsequent generalization test.
But if differential reinforcement is necessarily pro- 
grammed whenever an animal receives its daily ra- 
tion of food in the experimental situation, and not 
outside, then it is surely implausible to suppose that 
further differential reinforcement within the experi-
mental situation is necessary for the establishment of
stimulus control. The difference in schedule of rein- 
forcement between the experimental situation and the 
home cage should presumably be sufficient to establish 
control over responding by some features of the ap- 
paratus. 


MASKING 


Since instrumental responses are typically rein- 
forced only in a particular situation, some aspects of 
that situation should gain control over responding. 
Since evidence of control over a pigeon’s food-rein- 
forced responses by such stimuli as tones is often 
hard to come by, it remains to consider why non- 
differential reinforcement within the experimental 
situation is not sufficient to establish control by all
features of the situation. 

The simplest solution to this problem is surely that 
proposed by Hull (1952, pp. 64-69). If a pigeon’s re-
sponses are reinforced in the presence of a set of
stimuli S₁, S₂, S₃, . . . Sₙ, where S₁ represents a
1,000-Hz tone, S₂ the illumination from the response
key, and S₃ the illumination from the houselight, a
generalization test to other frequencies of the tone,
S₁′, S₁″, etc., will vary only S₁ and leave all other stim-
uli, S₂, S₃, . . . Sₙ, constant. To the extent that some
of these other stimuli have gained control over re- 
sponding, they will continue to control a high rate of 
responding on all test trials. Their presence, there- 
fore, may mask the control actually gained by the 
tone. 
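
The masking argument can be made concrete with a small numerical sketch. The sketch below is not Hull's own formulation: it assumes, purely for illustration, that the control exerted by each stimulus adds to a total response rate, that control by the tone falls off as a bell-shaped function of distance in octaves from 1,000 Hz, and that the strength values are arbitrary. All function names and numbers are hypothetical. The point it shows is that the tone can carry exactly the same control in both conditions and yet produce a nearly flat relative gradient when strong constant stimuli are present.

```python
import math

def tone_control(test_hz, trained_hz=1000.0, strength=1.0, width_octaves=0.5):
    """Control by the tone, falling off with distance (in octaves) from the
    trained frequency. The Gaussian shape is an assumption of this sketch."""
    distance = math.log2(test_hz) - math.log2(trained_hz)
    return strength * math.exp(-(distance ** 2) / (2 * width_octaves ** 2))

def response_rate(test_hz, constant_strengths):
    """Assumed additive rule: tone control plus control by every stimulus
    that stays constant across test trials (key light, houselight, ...)."""
    return tone_control(test_hz) + sum(constant_strengths)

def relative_gradient(test_frequencies, constant_strengths):
    """Response rates expressed as a proportion of the highest rate."""
    rates = [response_rate(hz, constant_strengths) for hz in test_frequencies]
    peak = max(rates)
    return [rate / peak for rate in rates]

# Hypothetical test values spanning the 300 to 3,500 Hz range.
TEST_FREQUENCIES = [300, 450, 670, 1000, 1500, 2250, 3500]
KEY_LIGHT_AND_HOUSELIGHT = [5.0, 1.0]  # arbitrary strengths of constant stimuli
DARK_CHAMBER = []                      # no constant visual stimuli

masked = relative_gradient(TEST_FREQUENCIES, KEY_LIGHT_AND_HOUSELIGHT)
unmasked = relative_gradient(TEST_FREQUENCIES, DARK_CHAMBER)

for hz, m, u in zip(TEST_FREQUENCIES, masked, unmasked):
    print(f"{hz:>5} Hz   masked {m:.2f}   unmasked {u:.2f}")
```

Because the tone's own strength is identical in the two conditions, any difference between the two printed gradients in this sketch is due entirely to the constant stimuli present at test.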

As was briefly noted earlier, a pigeon reinforced for 
pecking at a key containing a white line on a colored 
background may show relatively little control by the 
line when tested with other orientations shown on the 
same-colored background, but produce a steep gradi-
ent of generalization if the lines are displayed with-
out colors on a black background (Freeman & Thomas,
1967; Newman & Benefield, 1968). Thus some of the
stimuli displayed on the pigeon’s response key may 
mask the control gained by other stimuli on the key. 
Van Houten and Rudolph (1972) and Rudolph and 
Van Houten (unpublished) have extended these ob- 
servations by showing that stimuli presented on a 
pigeon’s response key may mask control by such rela- 
tively diffuse stimuli as a flow of air or a tone of 
particular frequency. In the former experiment, pi- 
geons were reinforced for pecking a key illuminated 
with white light in the presence of a 30-mph flow of 
air from a source behind the response key. When the 
speed of this airflow was varied between 30 and 0 mph 
in a subsequent generalization test, three out of four 
birds continued to respond at a relatively constant
rate, with only one bird showing evidence of reliable 
control by this stimulus. In a second group, however, 
trained to peck an unilluminated key in a dark box, 
all four birds showed excellent control by the airflow
stimulus in the subsequent generalization test. More- 
over, if birds were required to learn a discrimination 
between two different velocities of airflow, they 
learned much more rapidly when the chamber was 
dark than when the key was illuminated with white 
light. Thus the presence of an illuminated response
key would mask the appearance of control by this 
nonvisual, relatively unlocalized stimulus. 

Rudolph and Van Houten confirmed the conclu- 
sions of this first study in a second experiment using 
auditory stimuli. Birds trained to peck an illuminated 
key in the presence of a 1,000-Hz tone generalized al-
most completely to other frequencies of tone. This
group, therefore, replicated Jenkins and Harrison’s
(1960) results. As briefly noted above, however, a sec-
ond group, trained to peck a dark key in the presence
of a tone, showed a reliable and steep gradient when 
tested with other frequencies. The results are shown 
in Figure 4. 

It is clear that the presence of visual stimuli may 
mask control by stimuli from other modalities in the
pigeon. The general conclusion suggested by these 
experiments, then, is that failures of stimulus control 
are more plausibly attributed to the presence of other 
stimuli which mask control by the experimenter’s 
stimulus than to the absence of implicit differential 
reinforcement. Variations in entirely diffuse stimuli 
may result in sloping generalization gradients, and 
the removal of potential masking stimuli reliably in-
creases the slope of such gradients.

Fig. 4. Relative gradients of auditory frequency generalization
in pigeons after responses to 1,000 Hz have been reinforced,
either in the dark, or with an illuminated key light. (After
Rudolph & Van Houten, unpublished data.)


OVERSHADOWING 


Rudolph and Van Houten’s studies, as they them- 
selves recognized, are open to an alternative interpre- 
tation. Control by the frequency of a tone or the 
velocity of a flow of air may not just be masked by 
the constant presence of the key light during testing. 
The presence of this more salient visual stimulus dur- 
ing acquisition may have prevented such auditory or 
tactile stimuli from acquiring control in the first 
place. Such a possibility was envisaged by Lashley and 
Wade (1946) when they argued that a flat gradient of 
generalization obtained when one feature of the train- 
ing situation was varied might signify that the subject 
had attended to some other feature of the situation 
during training. 

We do not need to subscribe without reserve to 
Lashley and Wade’s theoretical analysis in order to 
accept the possibility of an effect such as this. That 
the presence of a more intense or salient stimulus 
may interfere with the acquisition of control by a 
less intense or salient stimulus was first reported by 
Pavlov (1927, pp. 141-143), who termed the effect 
“overshadowing.” He reported that dogs given clas- 
sical conditioning with a compound CS containing 
one intense and one weak component might show 
essentially no conditioning to the weak component 
presented alone on a test trial, “although it is obvi- 
ous... that the ineffective component . . . could 
easily be made to acquire powerful conditioned prop- 
erties by independent reinforcement outside the com- 
bination” (p. 142). Evidence of overshadowing has 
been reported in a variety of other situations: in CER 
conditioning by Kamin (1969), in discrete-trial simul- 
taneous discrimination learning by Lovejoy and Rus- 
sell (1967), and in discrete-trial successive discrimina- 
tion learning by Miles and Jenkins (1973). 

The principle of masking, as defined here, states 
that the presence of one stimulus, A, may obscure the 
expression of control by a second stimulus, B, even 
though it can be shown (by testing with B in the ab- 
sence of A) that B has acquired significant control 
over responding. The principle of overshadowing 
states that the presence of A may interfere with the 
acquisition of control by B. The distinction between 
the two can best be illustrated by reference to a con- 
crete experiment. Farthing (1972) trained two groups 
of pigeons on a successive discrimination. For one 
group, a vertical line on a red background served as
S+, and a green key light served as S-. For the second
group, the color projected onto the key was the same
on both positive and negative trials, and the only 
stimulus correlated with the availability of reinforce- 
ment was the presence or absence of the line. After 
acquisition, generalization tests were given to other 
orientations of the line, displayed on either a red or a 
black background. The results of these generalization 
tests are shown in Figure 5. The difference in the 
performance of both groups between test trials when 
the lines were shown on a colored background and 
those trials when they were presented on an uncol- 
ored ground may be taken as evidence of a masking 
effect. The presence of the colored background signifi- 
cantly decreased the control over responding displayed 
by the line in both groups. Superimposed on this
effect, however, there is also a clear difference be- 
tween the gradients produced by the two groups, re- 
gardless of the type of test trial. Even when the lines 
were displayed on a black background, subjects for 
whom the difference between positive and negative 
trials in acquisition had been marked both by differ- 
ences in color and by the presence of the line showed 
significantly less control by line than did subjects for 
whom the presence of the line was the only signal for 
reinforcement. The presence of the additional wave-
length cue during training reduced the control ac-
quired by the line during this phase of the experi-
ment.

Fig. 5. Relative gradients of line-orientation generalization in
pigeons trained either on a line-orientation discrimination or

In Rudolph and Van Houten’s studies, the illumi-
nation of the key light was the stimulus that inter-
fered with control by tones or airflows. The nature of
this stimulus makes it impossible to determine whether 
they were observing masking, overshadowing, or a 
combination of the two. For in order to prove that 
the key light had actually overshadowed these other 
stimuli, it would be necessary to show that its pres- 
ence on test trials had not merely been masking con- 
trol. But in order to show this, it would be necessary 
to conduct test trials with the key light dark. If 
pigeons have been trained to peck an illuminated key, 
however, they will, unless there is another source of 
illumination, stop pecking when the key is abruptly 
darkened. In the absence of responding, it is impossi- 
ble to assess the degree of control acquired by any 
stimulus. 

This difficulty does not detract from the main con-
clusion suggested by this body of research: a major
reason why some stimuli fail to show significant con-
trol over responding is that they are either over-
shadowed or masked by the presence of more salient
stimuli. This conclusion has the further virtue, as we
shall see, of explaining why discrimination training
is frequently necessary to establish control by rela-
tively unsalient stimuli. A great deal of research has
been devoted to an examination of the effects of vari- 
ous discriminative procedures on stimulus control, 
and it is time to turn to this question. 


Intradimensional Discrimination Training 


Although, as was argued above, differential rein- 
forcement within the experimental situation may not 
be necessary for the acquisition of stimulus control, 
this should not be taken to imply that such differen-
tial reinforcement has no effect on control. On the
contrary, discrimination training has powerful and 
important effects on generalization. 

Pavlov (1927) reported the most obvious instance 
of this effect of discrimination training on generaliza- 
tion. If a particular tone was established as a classical 
CS for food, the presentation of other tones would 
also elicit salivary CRs. In order to prevent the oc- 
currence of such generalized CRs, Pavlov stated, it was 
necessary to continue reinforcement in the presence of 
the original CS and to present the other tones without 
reinforcement. Discriminative conditioning between 
neighboring stimuli would thus sharpen the gradient 
of generalization between them. Numerous other 
studies have examined the effects of providing dis-
crimination training between two stimuli falling along
a particular dimension on subsequent generalization 
to other values of that dimension. Such intradimen- 
sional discrimination training has sharpened gradi- 
ents of auditory frequency generalization in experi- 
ments on galvanic skin response (GSR) conditioning 
in human subjects (Hovland, 1937), eyelid condition- 
ing in rabbits (Moore, 1972), and key pecking in 
pigeons (Jenkins & Harrison, 1962), and similar re- 
sults have been reported for brightness generalization 
in rats (Schlosberg & Solomon, 1943) and wavelength 
generalization in pigeons (Honig, 1962; Honig, 
Thomas, & Guttman, 1959). The results of the study
by Honig et al. are shown in Figure 6.

The generally accepted explanation of this result 
has been some version of that proposed by Pavlov
himself. Reinforcement in the presence of one stimu-
lus and nonreinforcement in the presence of another
are said to decrease responding to stimuli falling be-
tween S+ and S- because the tendency to respond
produced by reinforcement at S+, which generalizes
to these intervening stimuli, is counteracted by a
tendency not to respond, produced by nonreinforce-
ment at S-, which also generalizes to the intervening
stimuli. The resulting postdiscrimination gradient is
a consequence of the interaction between the “excita-
tory” gradient centered round S+ and the “inhibi-
tory” gradient centered round S-, an analysis first
formally proposed by Spence in 1937.
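
The logic of this gradient-interaction account can be sketched numerically. The sketch below is only an illustration of the general idea, not a reconstruction of Spence's actual functions: it assumes that both gradients are bell-shaped and of equal width, that the net tendency to respond is excitation minus inhibition (floored at zero), and that S+ and S- lie at hypothetical wavelengths of 550 and 570 nm. All names and parameter values are invented for the example.

```python
import math

def gradient(x, center, height, width):
    """Bell-shaped generalization gradient (the shape is an assumption)."""
    return height * math.exp(-((x - center) ** 2) / (2 * width ** 2))

def net_tendency(x, s_plus=550.0, s_minus=570.0,
                 exc_height=1.0, inh_height=0.6, width=15.0):
    """Excitatory gradient around S+ minus inhibitory gradient around S-.
    Negative values are treated as no responding."""
    excitation = gradient(x, s_plus, exc_height, width)
    inhibition = gradient(x, s_minus, inh_height, width)
    return max(0.0, excitation - inhibition)

# Hypothetical test wavelengths on both sides of S+.
for wavelength in range(520, 591, 10):
    excitation_only = gradient(wavelength, 550.0, 1.0, 15.0)
    print(f"{wavelength} nm   excitation alone {excitation_only:.2f}"
          f"   net {net_tendency(wavelength):.2f}")
```

Under these assumptions the net gradient falls steeply between S+ and S-, where the two gradients overlap most, while responding on the side of S+ away from S- is reduced only slightly, since little inhibition generalizes that far.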

There is, however, good reason to believe that
such an interaction between hypothetical excitatory
and inhibitory gradients, although of considerable
importance, is not sufficient to account for all features
of the postdiscrimination gradient. As Figure 6 shows,
the effect of nonreinforcement at S- is not only to
increase the slope of the gradient between S+ and S-;
it also sharpens the gradient on the other side of
S+. This decline in responding to stimuli so far re-
moved from S- is difficult to attribute to the gen-
eralization of any inhibitory tendency not to respond
to S-. A similar effect can be observed in several
other studies of intradimensional discrimination train- 
ing (e.g., Hanson, 1959; Jenkins & Harrison, 1962; 
Moore, 1972). It seems probable that discrimination 
training has further effects responsible for this addi- 
tional sharpening of generalization gradients. This 
conclusion is amply confirmed by studies of interdi- 
mensional discrimination training, where changes in 
the slope of postdiscrimination gradients cannot be 
attributed to any interaction between excitatory and 
inhibitory tendencies. 


Interdimensional Discrimination Training 


Jenkins and Harrison (1960) were the first to use 
interdimensional training to examine the effects of 
differential reinforcement on the slope of a generaliza- 
tion gradient, uncomplicated by interactions between 
excitatory and inhibitory tendencies. They trained 
pigeons on a discrimination between the presence and 
absence of a 1,000-Hz tone and then tested for gen- 
eralization along the dimension of auditory frequency. 
Although, as noted above, nondifferential reinforce- 
ment resulted in a relatively flat frequency gradient, 
reinforcement in the presence of the tone, randomly 
alternated with nonreinforcement in its absence, re- 
sulted in steep and reliable gradients in all subjects. 

Jenkins and Harrison’s results have been confirmed 
and extended in a number of subsequent studies. 
Newman and Baron (1965), Switalski, Lyons, and 
Thomas (1966), and Lyons and Thomas (1967) have 
shown that giving pigeons training between the 
presence and absence of a line or specific wavelength 
on the response key significantly sharpens generaliza- 
tion gradients of orientation or wavelength. Moore 
(1972) has reported that rabbits given differential eye- 
lid conditioning with a tone as CS+ and a light as 
CS- show a steeper gradient of auditory frequency 
generalization than a control group simply given 
reinforced trials with CS+ alone. Several studies of 
conditioned reinforcement in rats may be taken to 
imply a similar effect (e.g., Notterman, 1951). There 
is no question, then, but that differential reinforce- 
ment between the presence and absence of a specific 
stimulus will increase the slope of a generalization 
gradient measured when some feature of that stimulus 
is subsequently varied. 

The important feature of interdimensional training 
is that the stimulus correlated with nonreinforcement 
during initial acquisition is presumably equidistant 
from all test stimuli. Thus the increase in the slope 
of the postdiscriminative gradient cannot be attrib- 
uted to the differential generalization of inhibitory 
tendencies to different stimuli. Jenkins and Harrison 
accepted that the type of analysis proposed by Hull 
(1952) provided the most plausible explanation of 
their data. As noted above, Hull’s argument amounted 
to saying that if a pigeon were nondifferentially rein- 
forced for responding in the presence of a 1,000-Hz 
tone, then although the frequency of the tone might 
be established as a signal for reinforcement, the po- 
tential control acquired by the tone might be masked 
by innumerable other features of the situation, each 
of which was as well correlated with the delivery of 
reinforcement as was the tone itself. These other 
features remain constant during a test for auditory 
frequency generalization, and would therefore main- 
tain a constant rate of responding. By providing 
differential reinforcement between the presence and 
absence of the tone, the experimenter insures that at 
least some of these other features are now less well 
correlated with reinforcement than is the frequency 
of the tone. If these differences in the “validity” of 
different features result in reliable differences in their 
control over responding, then interdimensional train- 
ing may sharpen generalization gradients because it 
effectively reduces control by incidental stimuli that 
would otherwise mask the stimuli varied by the ex- 
perimenter.² 
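
The masking argument can be sketched numerically. The following is a minimal illustration, not Hull's own formulation: it assumes that responding on a test trial is the simple sum of a component controlled by tone frequency and a constant component controlled by unvarying incidental features, and that the frequency component falls off linearly with distance from the training tone. The function name, parameter names, and all numerical values are hypothetical.

```python
def relative_gradient(test_frequencies, trained_frequency,
                      tone_control, incidental_control, spread=1000.0):
    """Responding to each test tone as a proportion of responding to the trained tone.

    tone_control       -- assumed strength of control by tone frequency
    incidental_control -- assumed constant contribution of unvarying features
                          (key light, chamber, and so on)
    spread             -- distance (Hz) at which control by the tone falls to zero;
                          a linear decline is an arbitrary illustrative choice
    """
    def responding(frequency):
        distance = abs(frequency - trained_frequency)
        tone_part = tone_control * max(0.0, 1.0 - distance / spread)
        return tone_part + incidental_control

    peak = responding(trained_frequency)
    return {f: responding(f) / peak for f in test_frequencies}

frequencies = [500, 750, 1000, 1250, 1500]

# Nondifferential reinforcement: incidental features retain strong control.
nondifferential = relative_gradient(frequencies, 1000,
                                    tone_control=1.0, incidental_control=4.0)
# Presence/absence training: control by incidental features is reduced.
differential = relative_gradient(frequencies, 1000,
                                 tone_control=1.0, incidental_control=0.25)

for f in frequencies:
    print(f, round(nondifferential[f], 2), round(differential[f], 2))
```

Control by the tone is identical in the two cases; the relative gradient is flat in the first only because the constant incidental component dominates responding at every test frequency.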

Rudolph and Van Houten’s unpublished experi- 
ment, described earlier, provides the first line of evi- 
dence to support this analysis. In their replication of 
Jenkins and Harrison’s study they showed that it was 
the presence of visual stimuli, such as the key light, 
that prevented the tone from exercising control over 
responding; pigeons trained and tested in the dark 
showed orderly and steep gradients of auditory fre- 
quency generalization. 

The second point that requires investigation is 
whether interdimensional auditory discrimination 
training can indeed suppress control by such masking 


2 As noted above, a particular stimulus may fail to control 
behavior after nondifferential reinforcement not because its 
control is masked by other, unvarying stimuli, but because it is 
overshadowed by these other more salient stimuli. This possi- 
bility does not, of course, affect the present argument. 


N. J. Mackintosh 


visual stimuli. Miles, Mackintosh, and Westbrook 
(1970) trained pigeons on a discrete-trial discrimina- 
tion with a tone as S+ and white noise as S-, but 
with the response key illuminated with light of a par- 
ticular wavelength on both positive and negative 
trials. Key color, therefore, served as a potential mask- 
ing stimulus, and, indeed, when subjects were ini- 
tially given S+ trials only, they showed strong control 
by the color of the key, responding consistently on 
test trials to the color they were trained with and not 
responding when the color was changed. The effect 
of auditory discrimination training, however, was to 
weaken this control by color; after nine sessions of 
tone-noise discrimination training, subjects responded 
consistently only to the tone and showed a signifi- 
cantly greater tendency to respond on test trials to 
the changed key color. 

The results of Miles et al. have been replicated in 
subsequent unpublished experiments by Miles. Once 
again interdimensional auditory discrimination train- 
ing led to a significant flattening of the generalization 
gradient obtained when the color of the key was 
changed; in this replication, moreover, subjects were 
tested in silence, so that the decrease in control by 
color could not have been due to any increase in con- 
trol by tone. Blough (1969) has also reported data 
showing that a stimulus common to both positive 
and negative trials of a discrete-trial discrimination 
will lose control over responding. It is reasonable, 
then, to argue that differential reinforcement between 
any arbitrary pair of stimuli will tend to reduce con- 
trol by other stimuli common to both positive and 
negative trials, and that the frequently observed 
sharpening of generalization gradients resulting from 
such discrimination training is a consequence of this 
suppression of control by such incidental stimuli 
which might otherwise act as effective masking stim- 
uli. 

Even if this is accepted as an explanation of the 
effects of interdimensional training, it must be clear 
that the principle itself stands in need of explanation. 
Why should discrimination training between the 
presence and absence of a tone, for example, reduce 
the control exercised by visual stimuli common to 
both positive and negative trials? Hull argued that 
this was simply a consequence of the new schedule of 
reinforcement correlated with such incidental stim- 
uli. If we schematize the tone as T and the light as 
L, then nondifferential reinforcement in the presence 
of the tone consists of a series of TL+ trials, while 
interdimensional training consists of a series of TL+ 
trials alternating with L- trials. In the former case, 
L is consistently reinforced; in the latter, L is equally 
often reinforced and not reinforced. This deteriora- 
tion in the schedule of reinforcement associated with 
L should, Hull argued, be sufficient to reduce the 
control it acquires. 

As Jenkins and Harrison (1960) and Wagner (1969) 
have pointed out, this is a distinctly implausible sug- 
gestion. By comparison with consistent reinforcement, 
it is true, differential reinforcement results in a de- 
cline in the correlation of an incidental stimulus with 
reinforcement, but it is unlikely that this decline 
would in and of itself be sufficient to have any drastic 
effect on the control such a stimulus acquires. Even 
the relevant stimuli in typical free operant schedules 
are only intermittently correlated with reinforcement; 
and it is well known that in discrete-trial situations a 
50% schedule of reinforcement is sufficient to estab- 
lish highly reliable responding. An important series 
of studies by Wagner, Logan, Haberlandt, and Price 
(1968) has confirmed that in such discrete-trial situa- 
tions the schedule of reinforcement associated with an 
incidental stimulus common to positive and negative 
trials of a discrimination is not, as such, sufficient to 
explain why such a stimulus fails to acquire control 
over responding. Three experiments were conducted, 
one employing instrumental discrimination learning 
in rats, a second conditioned suppression in rats, and 
the third eyelid conditioning in rabbits; but the basic 
design of each of these experiments was identical. For 
subjects in the discrimination group (hereinafter 
called Group TD, for true discrimination), reinforced 
trials to a tone-light compound (T₁L+) alternated 
with nonreinforced trials to another tone-light com- 
pound (T₂L-). The light, it should be noted, was 
common to both positive and negative trials, and 
when tested with L alone, subjects showed little or no 
tendency to respond. Discrimination training between 
T₁ and T₂ had apparently prevented L acquiring sig- 
nificant control. Instead of comparing this discrimina- 
tion group with a nondifferentially reinforced group, 
however, Wagner et al. used a control group that also 
received T₁L and T₂L trials and also received rein- 
forcement on only 50% of trials. The important differ- 
ence for this group (hereinafter called Group PD, for 
pseudodiscrimination training) was that the delivery 
of reinforcement was uncorrelated with T₁ and T₂. In 
this PD group, L acquired strong control over re- 
sponding in spite of the fact that the actual schedule 
of reinforcement associated with L was exactly the 
same as in Group TD. Thus it is not the schedule of 
reinforcement associated with an incidental stimulus 
during discrimination training that suppresses control 
by such a stimulus, but the fact that there are other 
stimuli better correlated with reinforcement. It is not 
its absolute validity or correlation with reinforcement 
that is the most important determinant of the control 
gained by a stimulus, but its relative validity com- 
pared to that of other available stimuli. 
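
The logic of the TD and PD comparison can be checked with a short calculation. The sketch below is only illustrative: the trial lists are idealized (twenty trials per group, in a fixed rather than random order), and the helper `has_better_predictor` is a deliberately crude, hypothetical way of expressing relative validity, namely whether any other stimulus departs further from the overall 50% reinforcement rate than the target does. It is not a measure used by Wagner et al.

```python
from collections import defaultdict

def absolute_validity(trials):
    """Proportion of the trials containing each stimulus that ended in reinforcement.

    `trials` is a list of (stimuli, reinforced) pairs, e.g. (("T1", "L"), True).
    """
    presented = defaultdict(int)
    reinforced = defaultdict(int)
    for stimuli, outcome in trials:
        for stimulus in stimuli:
            presented[stimulus] += 1
            reinforced[stimulus] += int(outcome)
    return {s: reinforced[s] / presented[s] for s in presented}

def has_better_predictor(validity, target, base_rate=0.5):
    """True if some other stimulus departs further from the base rate than `target`.

    A hypothetical stand-in for "relative validity", for illustration only.
    """
    target_information = abs(validity[target] - base_rate)
    return any(abs(value - base_rate) > target_information
               for stimulus, value in validity.items() if stimulus != target)

# Group TD: T1L always reinforced, T2L never reinforced.
group_td = [(("T1", "L"), True), (("T2", "L"), False)] * 10
# Group PD: the same compounds, each reinforced on half of its presentations.
group_pd = [(("T1", "L"), True), (("T1", "L"), False),
            (("T2", "L"), True), (("T2", "L"), False)] * 5

td_validity = absolute_validity(group_td)
pd_validity = absolute_validity(group_pd)

print("TD:", td_validity, "better predictor than L?",
      has_better_predictor(td_validity, "L"))
print("PD:", pd_validity, "better predictor than L?",
      has_better_predictor(pd_validity, "L"))
```

The light is reinforced on exactly half of its presentations in both groups, so its absolute validity cannot distinguish them; only in Group TD is it accompanied by stimuli that predict the outcome better.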

This conclusion is of profound importance, for it 
implies some interaction between the control over 
responding acquired by different stimuli. If incidental 
stimuli fail to acquire control, not simply because 
they are imperfectly correlated with reinforcement, 
but because other stimuli are better correlated, this 
suggests that stimuli may compete for the acquisition 
of control. In the experiments of Wagner et al. the 
auditory stimuli acquired little control over respond- 
ing in the PD groups, thus enabling the light to gain 
substantial control. In the TD groups, on the other 
hand, the auditory stimuli, being perfectly correlated 
with reinforcement, acquired strong control and thus 
prevented the acquisition of control by the light. We 
may say that the light failed to gain control because it 
was overshadowed by a better predictor of reinforce- 
ment. 

The term overshadowing is usually used in Pav- 
lov's original sense to refer to the effect of the pres- 
ence of a salient stimulus on the acquisition of control 
by an equally valid but less salient stimulus. The ex- 
tension of the term to cover the case where, of two 
equally salient stimuli, the more valid may interfere 
with the acquisition of control by the less valid im- 
plies a parallel between the two effects. There cer- 
tainly seems to be some resemblance, and, as we shall 
see later, similar theoretical analyses have been ap- 
plied to both effects. Whether differences in validity 
have the same effect as differences in salience, how- 
ever, is for present purposes less important than the 
acceptance of the general principle that it is not the 
absolute validity of a stimulus that determines its 
control, but whether it is accompanied by other, more 
valid predictors of reinforcement. 

This principle provides the most plausible inter- 
pretation of the effects of interdimensional discrimi- 
nation training on the acquisition of stimulus control. 
Differential reinforcement will result in the over- 
shadowing of a potentially wide range of stimuli com- 
mon to positive and negative trials, which might 
otherwise mask, or even themselves overshadow, the 
stimuli in which the experimenter is interested. There 
is, however, one very important corollary to this analy- 
sis. Discrimination training between the presence and 
absence of a discriminative stimulus may indeed re- 
sult in the overshadowing of all incidental stimuli, 
but this does not guarantee that the aspect of the 
discriminative stimulus subsequently varied in a gen- 
eralization test will be that which gains control over 
the subjects’ behavior. One more salient aspect may 
still overshadow another. If, therefore, discrimination 
training is given between the presence and absence of 
a compound stimulus, not all features of that com- 
pound will necessarily acquire good control. The ex- 
periment by Farthing (1972), described earlier, showed 
that if pigeons receive differential reinforcement be- 
tween a vertical line on a red key and no line on a 
green key, the presence of the wavelength difference 
between positive and negative trials significantly re- 
duced the control over responding acquired by the 
line. 

It is even more important to note that overshadow- 
ing of one aspect of a discriminative stimulus by an- 
other during interdimensional training may not re- 
quire the explicit use of a compound stimulus. An 
experimenter may give interdimensional training be- 
tween the presence and absence of a vertical line and 
then test for control by the line by varying its orienta- 
tion on a series of test trials. But the line may be 
characterized in many other ways—as having, for ex- 
ample, a particular size, height, width, and bright- 
ness. There is no guarantee that the feature varied by 
the experimenter will be the one to have gained con- 
trol over the behavior of the subject. Interdimen- 
sional discrimination training may still not insure 
control by the particular feature varied during the 
generalization test. 

A number of studies illustrate the validity of this 
line of reasoning. Boneau and Honig (1964) found 
that when pigeons were given conditional discrimina- 
tion training with one of the conditional stimuli be- 
ing the presence or absence of a vertical line on the 
key, they still showed a relatively flat gradient in a 
subsequent generalization test along the dimension of 
line orientation. Williams (1973) found that interdi- 
mensional discrimination training between the pres- 
ence and absence of a series of clicks emitted at a rate 
of 2.45 per sec was not sufficient to produce a sloping 
gradient to other click frequencies. Since pigeons were 
well able to learn a discrimination between two click 
rates, and, having done so, showed a reliable and 
steep gradient of generalization when tested with 
other rates, the failure of interdimensional training 
to establish control by click rate cannot be attributed 
to an inability to detect such differences. As Williams 
argued, it is most plausibly regarded as a consequence 
of overshadowing; other features of the clicks, such as 
their intensity or individual frequencies, were as well 
correlated with reinforcement, as was their rate, and 
by virtue, presumably, of their greater initial salience, 
may have overshadowed, or at least masked, control 
by rate. 

Unfortunately, neither Boneau and Honig nor 
Williams tested the prediction that other features of 
their discriminative stimuli had acquired control over 
responding. Williams did not measure generalization 
along the dimension of intensity, nor Boneau and 
Honig along the dimensions of brightness or line 
length. An experiment by Mackintosh (1965), how- 
ever, provides some relevant evidence. Rats received 
interdimensional training between the presence and 
absence of a white circle of a particular size displayed 
on the window of a jumping stand. Subsequent tests 
between the original circle and one of a different size 
revealed relatively poor control by the specific size of 
the circle used in original training. The feature of the 
situation that had gained most control over respond- 
ing was the brightness difference between a door con- 
taining a white circle and one containing no circle. 
Animals trained with the circle positive showed a 
strong preference for the larger (i.e., brighter) of two 
circles, regardless of their absolute sizes; while an- 
imals trained to respond to a blank door, with the 
white circle negative, showed a stronger preference for 
the smaller (i.e., less bright) of two circles, again re- 
gardless of their absolute sizes. 


Extradimensional Training 


In intradimensional training, subjects are exposed 
to differential reinforcement correlated with stimuli 
differing along the dimension subsequently varied in 
a generalization test. In interdimensional training, 
differential reinforcement is correlated with the pres- 
ence or absence of a specific stimulus, some aspect of 
which is then varied in a generalization test. But dis- 
crimination training can be programmed between a 
pair of stimuli quite unrelated to the set of stimuli 
varied during generalization testing. One can examine 
the effect of discrimination training between different 
wavelengths on the acquisition of control by a line or 
by a tone. Such a test requires, of course, that re- 
sponding at some point be reinforced in the presence 
of the line or tone. Two procedures which have been 
adopted for the provision of such reinforced experi- 
ence are illustrated in Table 1. In the first, successive- 
stage procedure, subjects are initially trained on, say, 
a wavelength discrimination and are then reinforced 
for responding to a vertical line. In the second, con- 
current procedure, the line is present during the 
course of wavelength discrimination training, appear- 
ing on both positive and negative trials. Since the 
two procedures may pose rather different problems 
for theoretical analysis, they will be treated sep- 
arately. 


SUCCESSIVE-STAGE EXTRADIMENSIONAL 
EXPERIMENTS 


That extradimensional training between some pair 
of stimuli might enhance the control apparently 
gained by an entirely different stimulus was first 
clearly shown by Honig (1969). Honig trained one 
group of pigeons on a discrimination between differ- 
ent key colors (Group TD), while a second group 
received the same sequence of stimuli uncorrelated 
with reinforcement (Group PD). Both groups were 
then reinforced for pecking at a set of vertical lines 
on the response key and finally received a generaliza- 
tion test along the dimension of orientation. Group 
TD showed a significantly steeper gradient than 
Group PD. Results clearly related to Honig’s have 
been reported by Eck, Noel, and Thomas (1969), who 
found that pigeons given TD training with one set of 
stimuli in stage 1 learned a new discrimination be- 
tween a new set of stimuli in stage 2 significantly 
faster than subjects receiving PD training in stage 1. 
Similarly, Frieman and Goyette (1973) confirmed that 
training on one discrimination would facilitate the 
learning of a second, independent problem. 

The argument of the preceding section was that 
discrimination training sharpens generalization gradi- 
ents by effectively neutralizing incidental stimuli that 
would otherwise interfere with the acquisition or ex- 
pression of control by the test stimulus. At first sight, 
Honig’s (1969) results seem inconsistent with any 
such analysis: it is hard to see why wavelength dis- 
crimination training should have had any effect on 
control by a subsequently presented stimulus. Wagner 
(1969), however, has shown how this type of analysis 
may be relevant to the understanding of extradimen- 
sional training. The critical assumption is that there 
may be incidental situational stimuli present during 
all stages of training. If these stimuli are neutralized 
during initial TD training, and if this effect transfers 
to stage 2 of the experiment, they will no longer com- 
pete for control with the new set of discriminative 
stimuli manipulated by the experimenter. In PD 
groups, on the other hand, such situational stimuli 
will gain control of responding in stage 1 and con- 
tinue to control behavior in stage 2. Wagner (1969) 
reported the results of an experiment on eyelid condi- 
tioning in rabbits, which provided evidence of just 
such an effect. The design of his experiment is shown 
in Table 2. A TD group was given discrimination 
training between two lights (L₁ and L₂); on separate 
trials they also received reinforcement signaled by a 
tone (T). This procedure produced steeper auditory 
gradients around T than those obtained from a PD 
group, which was treated identically in the presence 
of T but for whom L₁ and L₂ were randomly associ- 
ated with reinforcement. These results, therefore, 
replicated those obtained by Honig. Wagner also, 
however, provided an explicit incidental vibratory 
stimulus (V) common to all trials; thus the actual 
stimuli to which subjects were exposed were com- 
pounds, L₁V, L₂V, and TV. Test trials with V alone 
indicated that it had acquired less control over re- 
sponding in Group TD than in Group PD. This 
feature of Wagner’s design thus enabled him to show 
that the increase in control by T in Group TD was 
accompanied by a decrease in control by V. 


Table 2 Design of Experiment by Wagner (1969) 


GROUP   TRAIN                 TEST 
TD      L₁V+, L₂V-, T₁V+      T₁, T₂, etc. 
PD      L₁V±, L₂V±, T₁V+      T₁, T₂, etc. 


L₁, L₂ = Lights; T₁, T₂ = Tones; V = Vibratory stimulus 

Wagner’s results make it possible to maintain that 
discrimination training always increases control by 
one set of stimuli by suppressing control by others. 
The only new assumption required is that the inci- 
dental stimuli suppressed during the course of dis- 
crimination training remain suppressed when the 
original discriminative stimuli are removed. A minor 
modification of Honig’s design, however, produces 
data which are more problematic. Thomas, Freeman, 
Svinicki, Burr, and Lyons (1970, Experiments 1 and 2) 
confirmed Honig’s finding that birds given TD color 
training before reinforced exposure to a vertical line 
would show a steeper gradient to other orientations 
of the line than a PD group.³ In the experiments of 
Thomas et al., however, unlike Honig’s, the vertical 
line was shown in stage 2 compounded with one of 
the discriminative stimuli from stage 1. Birds initially 
given TD or PD training with green and red stimuli, 
with green signaling a variable-interval (VI) schedule 
for the TD group, were reinforced in stage 2 for re- 
sponding to a vertical line superimposed on a green 
background, before being tested for generalization to 
other orientations of the line on a black background. 
Although this change in procedure appears rela- 
tively minor, it might be expected to have had sub- 
stantial consequences. In analyzing Honig’s results, it 
was suggested that TD training might suppress con- 
trol by incidental situational stimuli, and that this 
loss of control by potentially competing stimuli might 
then enable the vertical line to gain more control 


3 Thomas et al. also obtained similar results when TD and 
PD training were given with two different line orientations, and 
subjects were then reinforced for responding to a single wave- 
length, followed by a generalization test to other wavelengths. 
For ease of exposition in what follows, however, it will be 
simpler to concentrate on their first experiment and assume 
that TD and PD training is given with two wavelengths and that 
subsequent training is given with a vertical line. 

over responding than in the PD group. Wagner’s ex- 
periment confirmed that an explicitly manipulated 
incidental stimulus would indeed gain less control 
over responding if presented in conjunction with 
stimuli correlated with reinforcement and nonrein- 
forcement (in a TD group) than with stimuli uncor- 
related with reinforcement (in a PD group). The 
change in procedure introduced by Thomas et al., 
however, involved presenting in stage 2 the vertical 
line itself in conjunction with one of the discrimina- 
tive stimuli of stage 1. This might be expected to have 
reduced the control acquired by the vertical line in 
the TD group. The principle of overshadowing ex- 
emplified in Wagner’s data implies that TD training 
will result in the suppression of control by any stim- 
ulus presented in conjunction with the relevant dis- 
criminative stimuli. Although this may include situ- 
ational stimuli, it is hard to see why it should not also 
have included the vertical line in the experiment of 
Thomas et al. Nevertheless, in that experiment, as in 
Honig’s, TD training resulted in an apparent increase 
in control by the vertical line. 

It is not only a theoretical principle of possibly 
limited importance, such as that of overshadowing, 
and the results of rather different experiments, such 
as Wagner's, that appear to conflict with the data of 
Thomas et al. For the design of their experiments is 
in fact very similar to the design of experiments on 
“blocking,” and blocking has been reliably observed 
in studies of free operant discrimination learning in 
pigeons, with designs extremely similar to that em- 
ployed by Thomas et al. Johnson (1970), for example, 
initially trained pigeons on a vertical-horizontal dis- 
crimination and then gave compound discrimination 
training with the vertical line superimposed on a 
blue background and the horizontal line superim- 
posed on a yellow background. In a subsequent gen- 
eralization test such birds showed reliably less control 
by wavelength than a control group that had re- 
ceived no training on the vertical-horizontal discrimi- 
nation in stage 1. Thus prior discrimination training 
on the vertical-horizontal problem reduced rather 
than enhanced the control by stimuli subsequently 
compounded with the original discriminative stimuli. 
Although similar blocking effects have not always 
been observed in similar experiments with pigeons 
(e.g., Farthing & Hearst, 1970), there is no doubt that 
the majority of similar studies have reported similar 
results (Chase, 1968; Miles, 1970; vom Saal & Jenkins, 
1970). 

Why should pretraining on one component of a 
compound sometimes reduce control by a second com- 
ponent, as in studies of blocking, and sometimes en- 
hance control by a second component, as in the studies 
by Thomas et al.? There are, in fact, several differ- 
ences between the design of the two types of experi- 
ment. Thomas et al. compared groups given TD or 
PD training in stage 1, while Johnson compared a TD 
group with an untreated control group. An experi- 
ment by Freeman (1967, cited by Honig, 1970), how- 
ever, suggests that in this situation at least, this differ- 
ence is of no consequence.⁴ A second difference is in 
the treatment of all subjects in stage 2. In experi- 
ments where TD training enhances control by the 
added component, animals are exposed to a single 
compound stimulus and reinforced for responding in 
its presence; in experiments where blocking is ob- 
served, animals receive discrimination training be- 
tween pairs of compound stimuli. Mackintosh and 
Honig (1970) have explicitly compared these two 
procedures by running in a single study the two pairs 
of groups shown in Table 3. As can be seen from Fig- 
ure 7, they found it possible to replicate both John- 
son’s finding of blocking and the finding of enhance- 
ment by Thomas et al. When stage 2 involved a 
discrimination between two lines shown on a colored 
background, pretraining on the line-tilt discrimina- 
tion reduced the control acquired by color. When 
stage 2 involved the nondifferential reinforcement of 
responding to a vertical line on a colored background, 
however, pretraining on the line-tilt discrimination 
tended to enhance the control gained by color. It 
seems probable, therefore, that the decisive factor 
determining the outcome of such experiments is 
whether stage 2 involves reinforcement for responding 
in the presence of a single compound or discrimina- 
tion training between two compound stimuli. 


4 Whether this is always true is a question discussed later. 


Table 3 Design of Experiment by Mackintosh and Honig (1970) 


GROUP                     STAGE 1    STAGE 2     TEST 
Blocking      TD          V+, H-     VB+, HY-    Wavelength generalization 
              Control                VB+, HY-    Wavelength generalization 
Enhancement   TD          V+, H-     VB+         Wavelength generalization 
              Control                VB+         Wavelength generalization 


Fig. 7. Relative gradients of wavelength generalization in pi- 
geons. The groups whose data are shown in the left panel 
received discrimination training between vertical and horizon- 
tal lines superimposed on backgrounds of 501 and 576 nm; 
those in the right panel were reinforced for responding to a 
vertical line on the 501-nm background. (After Mackintosh & 
Honig, 1970. © 1970 by the American Psychological Association. 
Reprinted by permission.) 

The occurrence of blocking is entirely consistent 
with the principle of overshadowing. In studies of 
overshadowing, the more salient or valid member of a 
compound stimulus may reduce the control acquired 
by the other component. In blocking, the validity of 
one element is increased by previously establishing it 
as a signal for reinforcement, and such pretraining 
reduces the control acquired by the other element. 
Why, then, should pretraining on one component ap- 
parently increase the control acquired by the other 
in the enhancement design? The answer must be re- 
lated to Mackintosh and Honig’s finding that en- 
hancement occurred instead of blocking only when 
subjects received nondifferential reinforcement for 
responding to the stimulus subsequently varied dur- 
ing generalization testing. The control group in this 
design, therefore, received no discrimination training 
at any stage of the experiment and, as can be seen 
from Figure 7, gave a flatter gradient than any other 
group in the experiment. If, as we have argued, dis- 
crimination training suppresses control by incidental 
stimuli, it is possible that the flat gradient shown by 
this control group was a consequence of masking by 
these incidental stimuli. The suppression of inciden- 
tal stimuli in the TD group by prior training on the 
line-tilt discrimination, then, may have been more 
than enough to compensate for the partial overshad- 
owing of the test stimulus that resulted from its pre- 
sentation in conjunction with the more valid line tilt. 
In the blocking design, on the other hand, both TD 
and control groups received discrimination training 
in stage 2. In neither group, therefore, should inciden- 
tal stimuli have succeeded in masking control by 
wavelength, and the only effect observed was the 
overshadowing of wavelength by the previously trained 
component. 
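
The argument that enhancement and blocking reflect the balance of two opposing processes can be laid out as a toy calculation. This is a sketch under strong assumptions and should not be read as a model proposed in the studies cited: control by wavelength is treated as a baseline from which two fixed costs are subtracted, one for overshadowing by the pretrained line tilt and one for masking by incidental stimuli in animals that never receive discrimination training. The function name and every numerical value are hypothetical, chosen only so that the masking cost exceeds the overshadowing cost.

```python
def wavelength_control(pretrained_on_lines, stage2_is_discrimination,
                       baseline=1.0, overshadowing_cost=0.3, masking_cost=0.6):
    """Toy additive account of control by wavelength (all values hypothetical).

    pretrained_on_lines      -- True for TD groups given line-tilt training in stage 1
    stage2_is_discrimination -- True in the blocking design, False in the
                                enhancement design (single reinforced compound)
    """
    control = baseline
    if pretrained_on_lines:
        # The previously trained, more valid line tilt overshadows wavelength.
        control -= overshadowing_cost
    received_discrimination_training = pretrained_on_lines or stage2_is_discrimination
    if not received_discrimination_training:
        # Incidental stimuli were never suppressed, so they mask wavelength.
        control -= masking_cost
    return control

designs = {"blocking": True, "enhancement": False}
for name, stage2_is_discrimination in designs.items():
    td = wavelength_control(True, stage2_is_discrimination)
    control_group = wavelength_control(False, stage2_is_discrimination)
    print(name, "TD:", round(td, 2), "Control:", round(control_group, 2))
```

With these values the TD group shows less control by wavelength than its control group in the blocking design and more in the enhancement design; the reversal depends entirely on the assumption that masking costs more than overshadowing.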

This may well seem an unduly elaborate analysis. 
In particular, one could take objection to the argu- 
ment that enhancement occurs because the overshad- 
owing of potentially masking situational stimuli is 
sufficient to outweigh the overshadowing of the test 
stimulus. Against this, however, it can reasonably be 
insisted that if such apparently contradictory results 
as enhancement and blocking depend upon relatively 
minor differences in experimental procedure, an ade- 
quate analysis is likely both to be complex and to 
appeal to a conflict between opposing processes. It is 
possible, nevertheless, that enhancement is not a con- 
sequence of differences between the overshadowing of 
incidental and test stimuli. It may be necessary to 
appeal to an entirely new set of principles. Thomas 
(1970) has argued that enhancement indicates the 
operation of a much more general process of attentive- 
ness which may affect the control acquired by any set 
of stimuli. Discrimination training, he suggests, may 
insure that animals learn 


the validity of external stimuli as signifying 
events or contingencies of significance for the 
welfare of the organism. In this way the benefits 
of discrimination training would not be specific 
to the dimension or dimensions varied in train- 
ing but might generalise to other aspects of the 
training stimulus as well. By the same token, 
nondifferential training might serve to teach the 
animal the insignificance of external stimuli 
and/or the futility of behaving differentially in 
their presence, and this learning might generalise 
to stimuli not involved in the initial training. 
(p. 324) 


Some light on these different interpretations may be 
shed by examining a further set of experimental re- 
sults, those obtained in the second type of extradi- 
mensional study referred to in Table 1. It is time to 
consider these studies. 


CONCURRENT EXTRADIMENSIONAL 
EXPERIMENTS 


In successive extradimensional experiments, the 
stimulus whose control is assessed is presented sep- 
arately from the extradimensional discriminative stim- 
uli. In concurrent extradimensional experiments, the 
test stimulus is presented, in conjunction with the 
discriminative stimuli, during the course of discrimi- 
nation training. It is, in other words, an explicit, in- 
cidental stimulus, common to positive and negative 
trials. Several recent studies of free operant discrimination 
learning by pigeons have examined the effect 
of such extradimensional training on the acquisition 
of control by such a stimulus and have yielded relatively 
consistent results. Thomas et al. (1970, Experiments 
3 and 4), for example, gave pigeons TD or PD 
training with two wavelengths, with a vertical line 
appearing on the response key on all trials; they then 
tested the birds for generalization to other orientations 
of the line presented on a black background. 
Just as TD wavelength training had enhanced control 
by a subsequently presented line, so in these studies 
TD training enhanced control by a line present during 
the course of TD training. 

Thomas (1970) has argued that this finding pro- 
vides definitive evidence that any principle of over- 
shadowing must at best be subservient to a much 
more important general effect of discrimination train- 
ing. There is, of course, no doubt that these results 
are not what the principle of overshadowing would 
lead one to expect. This is hardly surprising, for they 
are the exact opposite of the results reported by Wag- 
ner et al. (1968), which earlier provided the impetus 
for the application of the principle of overshadowing 
to the effects of differential reinforcement. Wagner 
et al. concluded that the control acquired by a stim- 
ulus reinforced on 50% of trials was adversely affected 
by the presence of other, more valid signals of rein- 
forcement. In their experiments, TD training between 
two tones reduced the control displayed by a light 
common to positive and negative trials. The design of 
their studies is exactly the same as that of Thomas 
et al., with tones instead of wavelengths, and a light 
serving as the incidental stimulus instead of a vertical 
line. And yet the two sets of studies produced dia- 
metrically opposed results. 
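
The notion of relative validity at issue here can be made concrete with a small calculation. The sketch below is illustrative only: the trial lists are hypothetical and are not data from Wagner et al. (1968), and the names `reinforcement_probability`, `td_trials`, and `pd_trials` are assumptions introduced for the example. It simply estimates, for each stimulus element, the proportion of trials containing that element on which reinforcement occurred.

```python
# Illustrative only: the trial lists are hypothetical, not data from Wagner et al. (1968).

def reinforcement_probability(trials, element):
    """Proportion of trials containing `element` that ended in reinforcement."""
    outcomes = [reinforced for stimuli, reinforced in trials if element in stimuli]
    return sum(outcomes) / len(outcomes)

# Each trial is (set of stimulus elements present, whether the trial was reinforced).
# TD: tone T1 is always reinforced, tone T2 never; the light L is present on every trial.
td_trials = [({"T1", "L"}, True), ({"T2", "L"}, False)] * 50

# PD: each tone is reinforced on half of its trials; the light is again always present.
pd_trials = [
    ({"T1", "L"}, True), ({"T1", "L"}, False),
    ({"T2", "L"}, True), ({"T2", "L"}, False),
] * 25

for name, trials in (("TD", td_trials), ("PD", pd_trials)):
    probs = {e: reinforcement_probability(trials, e) for e in ("T1", "T2", "L")}
    print(name, probs)
```

Under these assumed schedules the light is reinforced on half of its trials in both conditions; only in the TD condition are the tones better predictors than the light, which is the circumstance in which the principle of overshadowing predicts reduced control by the light.
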
Wagner et al. replicated their findings in three 
separate experiments. There is equally no doubt about 
the reliability of the results of Thomas et al. They 
have been confirmed in subsequent studies by Bresna- 
han (1970) and Turner and Mackintosh (1972) and in 
a series of unpublished experiments conducted at 
Dalhousie University by Dr. V. Gray. An understand- 
ing of the causes of this discrepancy, therefore, is a 
necessary prerequisite to any adequate theoretical 
analysis. 

In pursuit of his argument that the major effect of 
discrimination training is an increase in general attentiveness, 
insuring an increase in control by all stimuli, 
Thomas (1970) has sought to dispute the validity 
of the data given by Wagner et al. and has suggested 
that their conclusions are a mistaken inference from 
an inappropriate test procedure. Wagner et al. assessed 
the degree of control gained by the incidental 
visual stimuli in their experiments by measuring the 
amount of responding that occurred when that stimulus 
was presented alone, without the auditory discriminative 
stimuli. In the experiments of Thomas et 
al. and in subsequent replications of their work, the 
control exercised by the incidental stimulus has been 
assessed by varying some aspect of that stimulus and 
measuring the slope of the resulting generalization 
gradient. It is possible that these measures might not 
coincide. Subjects in a PD group might respond at a 
higher rate to the incidental stimulus presented alone, 
because they were less disrupted than were TD subjects 
by the removal of the discriminative stimuli. 
Simultaneously, however, PD subjects might also respond 
at a substantially higher rate when some 
aspect of the incidental stimulus was varied in a generalization 
test. The former measure was taken by 
Wagner et al. to imply stronger control by the incidental 
stimulus. The latter might imply less. 

Thomas, Burr, and Eck (1970) trained rats in a free 
operant situation with results that appeared to provide 
some support for this argument. Rats were 
trained to press a lever in the presence of two compound 
stimuli, T₁L₁ and T₂L₁; for TD animals 
T₁L₁ signaled a VI schedule of reinforcement and 
T₂L₁ signaled extinction; for PD animals each compound 
signaled reinforcement and extinction equally 
often. As is shown in Figure 8, when subjects were 
tested with L₁ alone and with a dimmer light, L₂, TD 
animals responded significantly less to L₁ than did PD 
animals, thus apparently showing less control by the 
light and confirming the results of Wagner et al. However, 
since they also responded very much less to L₂ 
than did PD animals, they in fact made a higher proportion 
of their total test responses to L₁ than did the 
Fig. 8. Effects of true (TD) and pseudo (PD) auditory discrimination 
training on generalization to different light intensities. 
The first and second panels show absolute and relative gradients 
in rats tested to the lights alone, where L₁ is the light intensity 
used in training. The third panel shows absolute gradients of 
generalization when rats were tested with the lights shown in 
conjunction with the auditory stimuli, where T₁ was S+ and 
T₂ was S− for the TD group. (After Thomas et al., 1970. © 1970 
by the American Psychological Association. Reprinted by permission.) 


PD animals. On this measure, therefore, they showed 
a steeper relative gradient along the dimension of 
light intensity and may be said to have shown more con- 
trol by the light. Thomas, Burr, and Eck thus claimed 
that there was no real discrepancy between the results 
reported by Wagner et al. and those originally re- 
ported by Thomas et al. They further argued that the 
only proper measure of control by an incidental stim- 
ulus is the slope of a relative generalization gradient 
when some feature of that stimulus is varied, and that 
the reason why TD animals in their experiment had 
responded at a lower rate to L₁ than did PD animals 
was simply because they were more disrupted by the 
removal of the auditory discriminative stimuli. They 
thus argued that the results of all of these studies were 
consistent with the proposition that TD training in- 
creases control by an incidental stimulus common to 
positive and negative trials. 

There are, however, features even of their own data 
that suggest some caution in accepting Thomas, Burr, 
and Eck’s conclusion, and the results of other experi- 
ments leave little doubt that neither their arguments 
nor their data can be accepted in their entirety. In the 
first place, there are grave problems involved in the 
interpretation of relative generalization gradients 
when these are based on widely differing absolute 
rates of responding. As Figure 8 shows, in Thomas, 
Burr, and Eck’s experiment there is little or no differ- 
ence between TD and PD groups in the slope of the 
absolute gradient. The reason why the relative gradient 
is steeper for the TD group is that it is derived from 
a lower absolute level of responding. To say that this 
represents a greater difference in the “true” response 
strength to L₁ and L₂ is, in effect, to say that the 
difference between 280 and 150 responses is really 
greater than the difference between 600 and 480. 
There is certainly no a priori reason why this should 
be true, and it is not difficult to think of reasons, such 
as a ceiling effect obscuring the true response strength 
to L₁ in Group PD, which would suggest exactly the 
opposite conclusion. Furthermore, as is also shown 
in Figure 8, when another pair of TD or PD groups 
was tested with L₁ and L₂, but this time presented in 
conjunction with T₁ and T₂, the PD animals now 
appear to have shown a steeper absolute gradient 
than the TD animals. Thomas, Burr, and Eck ignore 
these data, and merely assert, without citing statistical 
support, that “subjects responded approximately as 
much to compounds including L₂ as they did to those 
including L₁.” 
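
The arithmetic behind this objection can be laid out explicitly. The following sketch uses only the approximate response totals quoted in the text (280 and 150 for the TD group, 600 and 480 for the PD group); the labels L1 and L2 for the training light and the dimmer test light, and the function names, are assumptions made for the illustration. It contrasts the absolute difference in responding with the share of total test responses made to the training light.

```python
# Approximate response totals quoted in the text; L1 is taken to be the
# training light and L2 the dimmer test light (labels assumed for illustration).
responses = {
    "TD": {"L1": 280, "L2": 150},
    "PD": {"L1": 600, "L2": 480},
}

def absolute_difference(counts):
    """Slope of the absolute gradient, expressed as a simple difference in responses."""
    return counts["L1"] - counts["L2"]

def relative_share_L1(counts):
    """Proportion of total test responses made to the training light."""
    return counts["L1"] / (counts["L1"] + counts["L2"])

for group, counts in responses.items():
    print(group, absolute_difference(counts), round(relative_share_L1(counts), 3))
# TD 130 0.651
# PD 120 0.556
```

On these figures the two groups differ by only ten responses in the absolute measure, while the relative measure favors the TD group largely because its proportions are computed from a much smaller total.
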

There are, moreover, several experiments which 
have shown that the discrepancy between the data of 
Wagner et al. and those of Thomas et al. cannot be 
resolved by pointing to differences in their procedures 
for measuring control. Even if the control exercised 
by an incidental stimulus is assessed by varying some 
aspect of it in a generalization test, it is possible to 
confirm the finding by Wagner et al. that TD training 
may reduce, rather than enhance, such control. What, 
then, is the basis for this difference in outcome? Per- 
haps the most obvious difference between the studies 
by Wagner et al. and Thomas et al. is that in each of 
their experiments Wagner et al. employed either an 
instrumental or a classical discrete-trial procedure, 
while the studies by Thomas et al. and subsequent 
replications of their results have all employed free 
operant procedures. Turner and Mackintosh (1972) 
first suggested that this might be an important factor 
and presented data showing that pigeons given dis- 
crete-trial discrimination training might show a flatter 
gradient than a PD group around a stimulus common 
to positive and negative trials. Gray and Mackintosh 
(1973) confirmed this result. They trained pigeons on 
a series of discrete trials to peck a key illuminated 
with a vertical line on all trials. For TD birds, posi- 
tive trials were signaled by a tone and negative trials 
by white noise; for PD birds the tone and noise each 
signaled reinforcement on 50% of trials. The results 
of generalization tests, conducted in silence, to other 
orientations of the line are shown in Figure 9. It can 
be seen that in this experiment all measures agree in 
showing greater control by the line in PD animals 
than in TD animals: the PD group responded at a 
higher rate to the vertical line and also showed 


Fig. 9. Absolute and relative gradients of line-orientation generalization 
in pigeons following TD or PD training using a 
discrete-trial procedure. Panels: ABSOLUTE (mean number of 
responses) and RELATIVE (percent of total responses), each 
plotted against line orientation. (After Gray & Mackintosh, 1973.) 


steeper absolute and relative gradients along the 
dimension of line orientation. 

It is clear, therefore, that regardless of the pro- 
cedure used to assess control, discrimination training 
in a discrete-trial situation may decrease, rather than 
increase, the control gained by an incidental stimulus 
common to positive and negative trials. Thomas’s results 
are of less generality than he has supposed. 


ANALYSIS OF THE EFFECTS OF 
EXTRADIMENSIONAL TRAINING 


If the use of free operant or discrete-trial procedures 
is the critical variable determining the effects 
of discrimination training on control by incidental 
stimuli, it remains to attempt some interpretation of 
this difference. There are numerous differences be- 
tween the two procedures. Which are the important 
ones, and how do they come to affect the outcome of 
these experiments? 

In free operant discriminations, responses to S+ are 
usually reinforced on  variable-interval schedules, 
typically on VI 1-min schedules in the experiments of 
concern here. In discrete-trial discriminations, on the 
other hand, reinforcement is typically available on all 
S+ trials. It is possible that this marked difference in 
the schedule of reinforcement associated with S+ has 
an important effect on the experimental outcome. If 
it is accepted that discrimination training suppresses 
control by incidental stimuli by insuring that other 
stimuli are relatively more valid signals of reinforce- 
ment, the magnitude of this effect will necessarily 
depend on the difference in the validity of incidental 
and discriminative stimuli. The more precisely the 
presence of a discriminative stimulus signals the avail- 
ability of reinforcement, the more successfully it will 
overshadow an incidental stimulus. The S+ of a discrete-trial 
discrimination, signaling the immediate 
delivery of reinforcement on every trial, is surely a 
better predictor of reinforcement than is an S+ associated 
with a variable-interval schedule in a free operant 
discrimination. Thus TD training should be more 
effective in suppressing control by incidental stimuli 
in typical discrete-trial studies than in typical free 
operant experiments. There is, unfortunately, no 
evidence to support or refute these speculations. It 
should be noted, moreover, that although it may explain 
the direction of the difference between discrete-trial 
and free operant experiments, this suggestion 
will not explain why TD training in a free operant 
experiment should actually increase the control dis- 
played by an incidental stimulus. 

In the preceding section, the argument was advanced 
that discrimination training might increase 
control by stimulus A because it suppressed control by 
another stimulus B, which would otherwise have overshadowed 
or masked control by A. Following this 
line of reasoning, Turner and Mackintosh (1972) 
argued that the repetitive nature of responding on 
typical free operant schedules might provide a source 
of stimuli that came to control behavior and would 
therefore mask control by an incidental stimulus in 
a PD group. If these response-produced stimuli lost 
control as a consequence of TD training, an exteroceptive 
incidental stimulus might be able to exercise 
more control over responding. Even if discrimination 
training resulted in some overshadowing of this incidental 
stimulus by the relevant discriminative stimuli, 
it would also suppress control by response-produced 
stimuli, and this latter unmasking effect might be 
even more important. 

A study by Hall and Honig (1974) provides some 
support for this suggestion. They first gave pigeons 
TD or PD training with red and green overhead 
lights serving as discriminative stimuli and then rein- 
forced them for pecking at a set of vertical lines on 
the response key, before finally testing for generaliza- 
tion to other orientations of the line. One pair of TD 
and PD groups had been required to peck the key 
during initial TD and PD training, and these groups 
confirmed the results of Thomas et al., in that initial 
TD training enhanced control by the lines. The 
second pair of groups, however, had received free 
reinforcement not contingent on key pecking dur- 
ing their initial exposure to the TD or PD schedules. 
These groups, after being subsequently autoshaped to 
peck the vertical line and then given VI reinforcement 
for several sessions, showed no difference whatsoever 
in the slope of their generalization gradients when 
tested with other orientations of the line. The only 
obvious difference between the two pairs of groups 
was that where TD training increased control by the 
lines, animals were repetitively pecking at a response 
key during exposure to the TD or PD schedule. When 
this one factor was changed, there was no suggestion 
of an enhancement effect. 

These results are certainly consistent with the idea 
that TD training may enhance control by some set of 
exteroceptive stimuli only to the extent that such 
training suppresses control by stimuli associated with 
repetitive instrumental responding. The implication 
is that such response-produced stimuli, if not neutralized 
by discrimination training, may mask or even 
overshadow control by an exteroceptive stimulus. 
There is, of course, nothing novel in the suggestion 
that free operant schedules of reinforcement may en- 
able stimuli associated with the repetitive nature of 
responding to gain control over the subject’s behavior. 


Under a variable interval schedule of reinforce- 
ment, for example, the organism often responds 
at a nearly constant rate for long periods of 
time. All reinforcements therefore occur when 
it is responding at that rate, although this condition 
is not specified by the equipment. The rate 
becomes a discriminative and, in turn, a rein- 
forcing stimulus, which opposes any change to a 
different rate. (Skinner, 1966, p. 25) 


The point has been documented by Blough’s (1963) 
studies of pigeons trained to peck a key illuminated 
with a given wavelength on variable-interval and 
differential reinforcement of low rate (DRL) schedules. 
The probability of pecking within a very brief 
interval of a preceding peck was found to be essen- 
tially unaffected by changes in the wavelength pro- 
jected onto the key. Such pecks, Blough concluded, 
were controlled more by the occurrence of preceding 
pecks than by any exteroceptive discriminative stimulus. 
Both Hearst (1969) and Ray and Sidman (1970) 
have gone so far as to argue that free operant sched- 
ules of reinforcement may be inappropriate for the 
study of exteroceptive stimulus control. The probabil- 
ity of initiating a response (as in a discrete-trial pro- 
cedure) may be a better measure of the role of extero- 
ceptive discriminative stimuli than is the probability 
of continuing to respond. 

Turner and Mackintosh’s argument required only 
that some set of incidental stimuli, more prevalent 
in free operant than in discrete-trial procedures, 
should be responsible for the masking of control by a 
specific incidental stimulus in PD subjects. Whether 
or not these masking stimuli are a product of repeti- 
tive responding is still open to question. A further set 
of results, however, confirms that much of the effect 
observed in the experiments of Thomas et al. is un- 
doubtedly a consequence of such masking of control 
in the PD group (Honig, 1969, 1974; Turner & Mack- 
intosh, 1972). The design of Turner and Mackintosh’s 
experiment is shown in Table 4. Pigeons initially re- 
ceived TD or PD training between different key 
colors signaling variable-interval and extinction sched- 
ules, with a vertical line on the key on all trials. In 
the control condition, TD or PD groups were subse- 
quently reinforced for responding to a plain red key 
and were then tested for generalization to other orien- 
tations of the line on a black background. As in other 
studies, the TD group showed the steeper gradient. 
In the experimental condition, however, both TD and 
PD subjects received several sessions of TD training 
with a new pair of wavelengths and with no line on 
the key, before being tested for generalization along 
the dimension of line orientation. In these groups, 
there was no difference between the gradients of sub- 
jects initially given TD training and those given PD 
training. Additional TD training on a new pair of 
stimuli increased the control displayed by the line in 
PD subjects to the point where their generalization 


Table 4 Design of Experiment by Turner and Mackintosh (1972) 


CONDITION      GROUP   STAGE 1     STAGE 2   TEST 
Control        TD      BV+, GV−    R+        Orientation generalization 
Control        PD      BV±, GV±    R+        Orientation generalization 
Experimental   TD      BV+, GV−    R+, Y−    Orientation generalization 
Experimental   PD      BV±, GV±    R+, Y−    Orientation generalization 


B = Blue; G = Green; R = Red; Y = Yellow; V = Vertical line 
gradient was indistinguishable from that of TD 
subjects. 

That TD training with a new pair of stimuli can 
sharpen, for PD subjects, the gradient around a pre- 
viously presented incidental stimulus implies that the 
typically flat gradient produced by PD subjects in free 
operant experiments cannot be a consequence of any 
failure of that incidental stimulus to acquire control. 
It must be due to a masking of such control by 
stimuli, which can then be neutralized by subsequent 
discrimination training. Honig (1969, 1974) has 
demonstrated the converse of these results. Just as 
subsequent TD training can sharpen the gradient of 
PD subjects, so subsequent PD training can flatten the 
gradient of TD subjects. Honig also showed that this 
flattening can itself be reversed by further TD training. 
The generality and reliability of these results, 
therefore, leave little doubt that at least part of the 
difference between TD and PD gradients observed in 
the experiments of Thomas et al. and in subsequent 
experiments represents nothing more than the mask- 
ing of control in PD subjects by other irrelevant 
stimuli. 

Is this a sufficient account of the effects of extra- 
dimensional discrimination training? Thomas (1970), 
as we noted earlier, has insisted that it is necessary to 
appeal to a process of general attentiveness, brought 
into play by discrimination training, which can affect 
control by all stimuli. An increase in general attentiveness 
will increase the control acquired not only by 
relevant stimuli and subsequently presented stimuli, 
but also by incidental stimuli, present but irrelevant 
during the course of discrimination learning. It is important 
to see whether recourse to such an analysis is 
necessitated by the data. 

There is good reason to question whether Thomas’s 
analysis is sufficient to account for the data we have 
been considering. It is clear that discrimination train- 
ing in discrete-trial situations does not increase con- 
trol by incidental stimuli. It is equally clear that the 
effect observed in free operant experiments is due not 
so much to an increase in the control actually ac- 
quired by the incidental stimulus as to an increase in 
the probability that the control acquired by such 
stimuli during training will in fact be displayed in the 
test situation. It is possible, nevertheless, that discrim- 
ination training does have the sort of general effect 
postulated by Thomas, in addition to the more 
specific effects suggested here, and that these general 
effects serve to counteract more selective processes. 
The evaluation of this possibility must be a rather 
problematic affair. Some results obtained with pigeons 
by Gray, however, provide some support for such a 
compromise by showing that discrimination training 
might not reduce control by a specific incidental stim- 
ulus, even if masking effects have been controlled. 
The present argument has been that discrimination 
training always tends to suppress control by relatively 
less valid stimuli but that this effect might not always 
be observed because such training has also suppressed 
control by other, potentially masking stimuli. It fol- 
lows that if discrimination training could somehow 
be given to PD subjects so as to neutralize these other 
masking stimuli, this basic overshadowing effect 
would become apparent. Gray (personal communication) 
attempted to test this prediction by training 
pigeons to peck a vertical line superimposed on a blue 
or green background. For a TD group, the color of 
the background was correlated with the availability 
of reinforcement; for a PD group, no such correlation 
existed. For both groups, however, these trials were 
interspersed with trials on which the key light was 
white (with no line), and responding was not reinforced. 
From the outset of the experiment, therefore, 
the “PD” group received discrimination training, with 
the line on colored backgrounds signaling occasional 
reinforcement and a plain white key signaling nonreinforcement. 
Since the line was the best single predictor 
of reinforcement, it should have acquired 
strong control over responding. In the TD group, on 
the other hand, the color of the key was an even more 
reliable predictor of reinforcement and, by the principle 
of relative validity, should have overshadowed 
the line. There was no suggestion of such an effect: 
the slope of the line-tilt gradient was, indeed, marginally 
steeper in the TD group than in the PD group. 


DISCUSSION 


The argument to this point has been complex and 
possibly tortuous. Before attempting to assess the fur- 
ther theoretical implications of this argument, there- 
fore, it may be as well to recapitulate briefly the main 
outline. 


Recapitulation 


If a change in a particular stimulus results in a 
correlated change in a subject’s behavior, we may say 
that this stimulus controls the subject’s behavior. In 
any situation, it is obvious some stimuli will gain con- 
trol more rapidly than others, while yet other stimuli 
may apparently fail to gain control. We may cate- 
gorize these differences as differences in salience. 

The simplest possible account of stimulus control 
would be to say that the reinforcement of a particular 
response in the presence of a given set of stimuli will 
insure that all those stimuli will gain control over 
that response at a rate determined by their salience. 
We have seen that this is an oversimplification: a 
stimulus may gain control without necessarily display- 
ing that control in a particular test situation. This is 
often because a change in one feature of the situa- 
tion leaves other controlling features unchanged: the 
responding maintained by these latter features may 
prevent the experimenter from detecting the control 
gained by the former. The unchanged features serve 
to mask control, which can readily be detected by 
testing in their absence. This masking effect is of no 
great theoretical significance in itself⁵; its importance 
lies in the fact that the experimenter may fail to 
recognize its presence and thus fail to appreciate the 
reasons for apparent failures of stimulus control. It is 
clear, for example, that if a pigeon’s key pecks are 
nondifferentially reinforced in the presence of a tone, 
the tone is well able to acquire control over respond- 
ing. The absence of control by the tone may simply be 
a consequence of masking by the key light. It is also 
probable that responding on most free operant sched- 
ules of reinforcement is at least partially under the 
control of previous responses, which may mask con- 
trol by exteroceptive stimuli. There is indeed direct 
evidence that this is true of some of a pigeon’s key 
pecks. Experimenters need to remind themselves that 
the experimental situations in which they place their 
subjects contain a multiplicity of features, and that a 
failure to detect control by one feature may reflect 
nothing more than the control gained by others. 

A more salient stimulus might not only mask the 
expression of control by another, it might also prevent 
the less salient stimulus from acquiring control in the 
first place. This principle of overshadowing also ap- 
plies to the case where stimuli differ in validity rather 
than in salience: a stimulus which is reliably corre- 
lated with the occurrence of reinforcement may pre- 
vent one less well correlated from acquiring control 
over responding. Whenever discrimination training is 
programmed, constant features of the experimental 
situation become less well correlated with reinforce- 
ment than the discriminative stimuli. The principle 
of overshadowing, therefore, implies that the discrim- 
inative stimuli will prevent these situational stimuli 
from acquiring control, and this may explain the 
effect of discrimination training on stimulus control. 
Nondifferential reinforcement in the presence of a 


5 This is, perhaps, an exaggeration. An adequate explanation 
of masking requires at least some assumptions about the effects 
of adding together different sources of response strength. For one 
set of possible assumptions, see Hull (1952, p. 66). 
tone, for example, will produce little control by the 
tone, because other, more salient stimuli, such as the 
key light, will either mask or overshadow the tone. 
Differential reinforcement between the presence and 
absence of the tone, however, will suppress control by 
the key light and other situational stimuli and pre- 
vent their masking the tone. Whenever animals are 
exposed to differential reinforcement, therefore, 
stimuli correlated with these changes in reinforcement 
will tend to acquire control over behavior, and in so 
doing will prevent other less valid stimuli from gain- 
ing control, even if these other stimuli are intrinsi- 
cally more salient. Since these other stimuli may in- 
clude such features as the shape and size of the 
apparatus, a background masking noise, the time of 
day, or the occurrence of previous responses, it is not 
surprising that differential reinforcement should have 
such pervasive effects and should so reliably enhance 
the control gained by discrete features of the experi- 
mental situation, such as stimuli on a pigeon’s re- 
sponse key. 

Whether this principle is sufficient to explain all 
effects of discrimination training on stimulus control 
is still an open question. In some situations, discrim- 
ination training, so far from enhancing control by 
relevant stimuli only at the expense of irrelevant 
stimuli, appears to enhance control by the latter also. 
It may be possible to reconcile this observation with 
the principle of overshadowing by arguing that dis- 
crimination training suppresses control by other 
irrelevant stimuli which might mask the control 
gained by the particular incidental stimulus manip- 
ulated by the experimenter. It is also possible, how- 
ever, that discrimination training has additional, more 
general effects on stimulus control. 


Theoretical Analysis 


The argument throughout this chapter has been 
theoretical in the sense that I have not listed a set of 
empirical conditions known to affect the slope of gen- 
eralization gradients, but have rather attempted to 
reach an understanding of the principles governing 
those effects. The principles invoked, however, them- 
selves stand in need of explanation. In particular, it 
is important to consider how overshadowing and gen- 
eral attentiveness can be incorporated into theoretical 
analyses of learning. 


OVERSHADOWING 


There can be no gainsaying the fundamental im- 
portance of overshadowing. The presence of a more 
salient or more valid stimulus is apparently able to
interfere with the acquisition of control by less salient
or less valid stimuli. This observation contradicts the
basic assumption, common to most traditional
theories of learning, that all stimuli present at the 
moment of reinforcement will gain control over be- 
havior, with any differences in the control acquired by 
different stimuli being a consequence of their own 
salience or absolute validity. The occurrence of over- 
shadowing implies, on the contrary, some interaction 
or competition between stimuli for control of be- 
havior. 

One possible explanation of overshadowing is to 
appeal to what Thomas (1970) has called an “inverse
hypothesis.” The assumption is that there is a fixed
upper limit to the control that can be gained by any 
set of stimuli, and that this must be shared between 
all stimuli present in the experimental situation. The 
greater the control acquired by one subset of the avail-
able stimuli, therefore, the less will be available for
others. One expression of this inverse hypothesis is to 
be found in theories of selective attention (Lovejoy,
1968; Sutherland & Mackintosh, 1971), which assume
an inverse relation between the probabilities or
strengths of attention to different sets of stimuli. The
term attention is here used not in the sense employed 
by Honig (1970) to delimit a class of experimental 
operations, nor yet in the sense employed by Terrace 
(1966) to refer to residual variability in experimental 
data. The meaning of the term is quite precisely de-
fined, by its use in a formal model, to refer to a
parameter whose value determines first the amount of
change in associative strength of a stimulus as a conse- 
quence of reinforcement and nonreinforcement and, 
secondly, the extent to which the subjects’ behavior 
will be actually controlled by that stimulus rather 
than by another at any particular moment. 

The inverse hypothesis may also be derived from a 
theory of associative competition (Rescorla & Wagner, 
1972; Revusky, 1971). Following a suggestion of 
Kamin (1969), Rescorla and Wagner have argued that 
changes in the associative strength of one component 
of a compound depend upon the current associative 
strength of all other components. This will insure that
the asymptotic associative strength of one component 
of a reinforced compound will be inversely related to 
the asymptotic strength of other components. Over- 
shadowing of an auditory by a visual component is a 
consequence of the fact that the visual component, by 
virtue of its greater salience or better correlation with 
reinforcement, acquires associative strength more 
rapidly than the auditory component and thus re- 
duces the associative strength available for condition- 
ing to the latter. 
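
The competition described above can be sketched numerically. What follows is a minimal illustration only, assuming the commonly cited form of the Rescorla and Wagner (1972) rule, in which the change in strength of each component on a reinforced trial is proportional to its salience and to the difference between the asymptote and the summed strength of all components present. The function name, the salience values, the learning-rate parameter, and the asymptote are arbitrary assumptions chosen for the example, not values taken from any experiment cited in this chapter.

```python
def rescorla_wagner(saliences, beta=0.2, lam=1.0, trials=200):
    """Simulate reinforced conditioning of a stimulus compound.

    saliences: dict mapping a stimulus name to its salience (alpha).
    beta: learning-rate parameter for reinforcement (assumed value).
    lam: asymptote of associative strength supported by the reinforcer.
    Returns a dict of associative strengths after the given number of trials.
    """
    strengths = {name: 0.0 for name in saliences}
    for _ in range(trials):
        # The error term is shared: it depends on the summed strength
        # of every component present on the trial.
        error = lam - sum(strengths.values())
        for name, alpha in saliences.items():
            strengths[name] += alpha * beta * error
    return strengths


# Hypothetical saliences: a salient light and a weaker tone.
compound = rescorla_wagner({"light": 0.5, "tone": 0.1})
tone_alone = rescorla_wagner({"tone": 0.1})
```

With these assumed values the tone ends with roughly one sixth of the available strength when trained in compound with the light, but approaches the full asymptote when trained alone, which is the pattern the text describes as overshadowing.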

Overshadowing seems a sufficiently reliable phe-
nomenon to require some version of the inverse hy-
pothesis for its explanation. It is not possible, as has 
been attempted by Thomas (1970), to dismiss all 
instances of overshadowing or blocking as artifacts 
susceptible of alternative explanation. Nevertheless, 
Thomas’s point is worth serious consideration, even 
if not always for the reasons he advances. There have, 
in fact, been several reported failures of overshadow- 
ing, both in instrumental learning (e.g., Sutherland & 
Andelman, 1967) and in classical conditioning (e.g., 
Schnur, 1971). Mackintosh (1974), indeed, has sug- 
gested that on the basis of the available evidence 
overshadowing requires either marked differences in 
salience between overshadowing and overshadowed 
cues or a difference in their correlation with rein- 
forcement. A weak or poorly correlated stimulus will 
be overshadowed by another more salient or more 
valid one, but it will not itself detract from condition-
ing to the stronger stimulus; nor indeed is there much
evidence that two salient or equally correlated stimuli
will overshadow one another. If this conclusion is
substantiated by further research, it will require some
modification of the rather rigid interpretation of the
inverse hypothesis postulated by theories of selective 
attention or limited associative capacity. 

Before leaving the topic of overshadowing, it is 
worth showing that it provides a simple explanation 
of one result which has not been considered so far. 
We have documented the finding that in free operant
situations there is better control over responding
following TD training than after PD training. There
is good evidence that a major part of this difference
is due to the difference between the TD group and
subjects receiving simple, nondifferential reinforce-
ment, even if this is programmed in an unchanging
stimulus situation (e.g.,
Bresnahan, 1970; Honig, 1969; Turner & Mackintosh, 
1972). We have not considered whether PD training 
would produce any different level of control from that 
produced by simple nondifferential reinforcement. 
The question is whether, for example, nondifferential
reinforcement of a pigeon’s key pecks in the presence 
of a vertical line and an unchanging wavelength 
would result in stronger control by the line than 
would the same schedule of reinforcement in the 
presence of the line, but with random variations in 
wavelength. Although some studies have suggested 
that these two procedures may have relatively similar 
effects on control in such a situation (e.g., Thomas et
al., 1970, Experiment 5), there are other studies which
have rather clearly shown that the PD treatment may 
result in significantly flatter generalization gradients 
(Bresnahan, 1970; Honig, 1969, 1974; Tomie, Davitt,
& Thomas, 1973).

Tomie et al. have inferred from this difference that
animals learn to be inattentive to all stimuli when ex- 
posed to variations in one set of stimuli uncorrelated 
with variations in reinforcement. It is equally possible, 
however, to interpret such a result in terms of Res- 
corla and Wagner’s (1972) account of overshadowing. 
If the flat gradients of PD subjects are a consequence 
of control by some set of masking stimuli, then the 
more control such stimuli acquire, the flatter these 
gradients will be. But if stimuli compete for the acqui- 
sition of control, then the control acquired by mask- 
ing stimuli must be inversely related to the control 
acquired by any other stimuli explicitly manipulated 
by the experimenter. A single constant wavelength 
will acquire control more rapidly than will a ran- 
domly varying pair of wavelengths. In the former case, 
therefore, potential masking stimuli will have less 
chance to acquire control and will be less able to mask 
control by other stimuli subsequently varied in a gen- 
eralization test. 


GENERAL ATTENTIVENESS 


A large part of Thomas’s argument against the 
inverse hypothesis was based on the assertion that 
evidence for a process of selective attention is far out- 
weighed by evidence for a process of general attentive- 
ness. Although this claim may be exaggerated, there 
are sufficient problems with attempts to attribute all 
effects of discrimination training to the overshadow- 
ing of incidental stimuli so that it becomes important 
to see whether any more precise characterization can 
be provided of this concept of general attentiveness. 
What are the factors which operate to prevent the 
relevant stimuli of a discrimination problem from 
overshadowing, and thus reducing control by, inci- 
dental stimuli common to both positive and negative 
trials? A simple suggestion, which should be consid- 
ered, even if it must also in the end be dismissed, is 
that discrimination training establishes a set of observ- 
ing or orienting responses which increase control by 
other stimuli located in proximity to the discrimina- 
tive stimuli. Pigeons trained to discriminate between 
two wavelengths projected onto a response key will 
learn to look at the key and will thus learn about any 
other stimuli, such as a line, that also appear on the 
key. Venerable as this suggestion may be, in the 
present context it is neither plausible nor empirically 
substantiated. It is hard to see how, even in the ab- 
sence of explicit discrimination training, a pigeon 
could learn to direct responses at a key without look- 
ing at it. Moreover, several studies of extradimen- 
sional transfer have used stimuli not located on the
response key, without this having any apparent effect
on the outcome of the experiment. In one experiment, 
Thomas et al. (1970) gave pigeons TD or PD training 
with different floor tilts serving as the discriminative 
stimuli and found that TD training resulted in reli- 
ably better control by the wavelength projected onto 
the key. Hall and Honig (1974) found differences in 
the control gained by a set of lines projected onto the 
response key after pigeons had received TD or PD 
training between different overhead lights. Conversely, 
Gray (personal communication) has shown that if 
pigeons are given TD or PD training with different 
wavelengths, with a tone present on all trials, TD sub- 
jects will show better control by the tone in a subse- 
quent generalization test. 

Thomas (1970) noted that results such as these 
ruled out the possibility that enhancement could be 
simply due to the establishment of appropriate observ- 
ing responses and concluded that “a central atten- 
tional mechanism seems required” (p. 327). If subjects 
are exposed to a correlation between changes in 
stimuli and changes in reinforcement, the assumption 
seems to be that they will come to expect that future 
changes in stimulation will also be correlated with 
changes in reinforcement. There is a certain plausibil- 
ity to this idea, but it should be recognized as no more 
than an intuitive and disturbingly vague proposal. 
Certainly no attempt has yet been made to specify a 
formal model with the required properties. Until such 
a day, it may be worth seeing whether more prosaic 
processes are sufficient to account for the observed 
data. 

A more neutral description of those data may help 
to suggest one possible set of ideas. Extradimensional 
enhancement is said to occur if subjects, given a gen- 
eralization test after TD training between two wave- 
lengths with a vertical line always present, respond 
relatively infrequently to other orientations of the 
line, whereas subjects given PD training continue to 
respond at a relatively constant rate regardless of the 
orientation of the line. If subjects are to show good 
control by the stimuli varied in a generalization test, 
they must either never respond, or rapidly stop re- 
sponding, to the majority of test stimuli. Since it is 
only rarely that subjects do not respond at all to any 
test stimulus, one determinant of the slope of most 
generalization gradients will be the rate at which re- 
sponding extinguishes to such stimuli. This point is 
documented by the observation, noted earlier, that 
generalization gradients become progressively steeper 
in the course of testing in extinction. 

It may therefore be parsimonious to attribute at 
least some of the difference between TD and PD gra-
dients to a simple difference in the resistance to extinc-
tion engendered by the two schedules. There is, in 
fact, independent evidence that discrimination train- 
ing may result in the more rapid extinction of re- 
sponding to S+ than does partial reinforcement in the 
presence of that stimulus (Jenkins, 1961), and several 
reasonably well-defined theoretical analyses are avail- 
able to explain such an effect. Amsel’s theory of par- 
tial reinforcement and persistence (Amsel, 1962, 1972) 
is readily applied to the present set of data. Animals 
exposed to a PD schedule may be assumed to have 
learned that responding in the face of occasional 
periods of nonreinforcement will be later reinforced. 
Responding will therefore persist in the face of a series
of nonreinforced test trials. After TD training, on the
other hand, subjects may learn that responding dur- 
ing periods of nonreinforcement will not be rein- 
forced, and they will therefore stop responding during 
a generalization test. Although Amsel has not applied 
his analysis specifically to the case of generalization 
tests, he has presented an account of discrimination 
learning which implies that prior exposure to a sched- 
ule of partial reinforcement will retard the acquisition 
of a successive discrimination by retarding the extinc- 
tion of responses to S− (Amsel & Ward, 1965). Amsel
and Ward’s original analysis assumed that such effects 
would be confined to situations where the discrimina- 
tive stimuli remained unchanged between discrim- 
ination training and prior exposure to partial rein- 
forcement. Several subsequent experiments, however, 
have shown that this is incorrect and that differences 
between partial reinforcement and either consistent or
differential reinforcement transfer virtually without 
loss even when the discriminative stimuli are changed 
(e.g., Flaherty & Davenport, 1972; Galbraith, 1973). 

Such transfer is readily explained by an extension 
of the analysis applied to the “generalized partial re-
inforcement effect” (Amsel, Rashotte, & MacKinnon,
1966; Brown & Logan, 1965). Animals exposed to 
partial and consistent reinforcement in different alleys 
respond persistently over a series of extinction trials 
even if extinction is conducted in the previously con- 
sistently reinforced alley. Although animals that re- 
ceive only consistent reinforcement in one alley 
rapidly stop running if extinguished in that alley, ex- 
posure to partial reinforcement in another situation is 
sufficient to insure persistent responding in extinction. 
Amsel refers to this finding as a case of mediated gen-
eralization. The events that control behavior in ex-
tinction are assumed to be stimuli arising from the 
delivery or omission of reinforcement: animals rein- 
forced for responding persistently in the face of non- 
reinforcement in one situation will continue to do so
when exposed to nonreinforcement in a new situa-
tion.⁶

In this analysis, the specific discriminative stimuli, 
in whose presence animals receive partial or consistent 
reinforcement, are viewed as providing a context in 
which habits of persistence or its converse are estab- 
lished. Thus examples of extradimensional transfer of
TD or PD training are explained not by saying that 
animals attribute significance to external stimuli and 
generalize this to other stimuli, but rather by saying 
that limited changes in contextual stimuli may not 
disrupt patterns of behavior that have previously been 
established under particular conditions of reinforce- 
ment. It seems probable that there must be some 
limits to this generalization. There may be better 
transfer of persistence to a nonreinforced generaliza- 
tion test, if animals have been partially reinforced in 
the presence of uncorrelated variations in external 
stimuli, than if such training has been given in an un- 
changing stimulus situation. This would provide an 
alternative explanation of the previously noted find- 
ing that PD training may result in even flatter gen- 
eralization gradients than exposure to a comparable 
schedule of reinforcement in the presence of a single, 
unchanging stimulus. 


Concluding Comments 


It is worth stressing one point in conclusion. What- 
ever the merits of rival explanations of stimulus selec-
tion and overshadowing on the one hand, and of rival
explanations of general attentiveness on the other, the
distinction between the principles of general and
selective changes in controlling stimuli may be diffi-
cult to maintain both in fact and in logic. To say that
discrimination training produces a “set to discrim- 
inate,” which encourages differential responding to 
all stimulus dimensions, may seem to imply some 
theoretical assumptions quite distinct from those im-
plied by theories of overshadowing. It is, however,
possible to argue that evidence of a general effect of
discrimination training is always a consequence of the
suppression of control by other stimuli which were
not measured by the experimenter.


6 Amsel himself has always assumed that the subject’s emo-
tional reactions to nonreinforcement provide the source of
stimulation which controls persistent behavior. This identifica-
tion does not affect the logic of the argument. In discrete-trial
situations, where responding on one trial is reinforced after a
preceding nonreinforced trial, it may be as appropriate to assume
that the memory of such an outcome, rather than the frustration
conditioned by that outcome, comes to control behavior (Capaldi,
1967). In the present context, it might be more appropriate to
make some entirely different identification. The important point
is that a free operant PD or nondifferential schedule reinforces
a steady rate of responding, even after relatively prolonged
periods of nonreinforcement. As soon as TD subjects learn the
discrimination, they do not receive reinforcement after respond-
ing through such prolonged periods of nonreinforcement. It
does not matter how such periods of nonreinforcement are
detected.

In the preceding section, an analysis of general 
attentiveness was proposed which suggested that PD 
subjects show less control by stimuli varied in a gen- 
eralization test than do TD subjects, because they 
have been reinforced for responding at a steady rate 
through periods of nonreinforcement and continue to 
do so during testing. In effect, this amounts to saying 
that they show relatively little control by the test 
stimuli, because their behavior is controlled by other 
events. This analysis, therefore, would seem to be a
special case of the general proposition advanced 
earlier: procedures which increase the control exer- 
cised by one set of stimuli achieve their effects by de- 
creasing the control exercised by others. If, as a conse- 
quence of discrimination training, subjects are ready 
to attribute changes in reinforcement to changes in 
external stimuli, as Thomas has suggested, this may 
simply be because they have learned that such changes 
are not dependent on the time of day, their own 
motivational state, the occurrence of preceding re- 
sponses or reinforcers, or yet other events which the 
experimenter has not even suspected. 
Animal Psychophysics*


INTRODUCTION 


This chapter concerns the assessment, by operant 
methods, of the sensory and perceptual capacities of 
nonverbal animals. Though behavioral methods have 
been used for such purposes since early in the century, 
modern operant techniques have markedly increased 
their efficiency. Many methodological variations have 
appeared; instrumentation has been vastly improved, 
and the number of species and problems studied has
multiplied. Experimenters have successfully answered 
questions that would have been out of reach just a 
few years ago, and further rapid development is in 
prospect. 

This burst of activity in animal psychophysics has 
been prompted in part by the rapid expansion of re- 
search on the anatomy and physiology of sensory proc- 
esses. For such experiments, nonhuman organisms 
have, of course, been popular subjects. The result- 
ing findings have called for corresponding psycho-
physical data from the same species, because a com-
plete understanding of sensory functioning within a
species requires interlocking anatomical, physiological,
and behavioral data. A broad understanding of sen-
sory functioning also requires cross-species compari-
sons. For example, when systems differ character-
istically in structure, psychophysical data may reveal
corresponding functional differences. Already such
correspondences have contributed a good deal to our
understanding of the way sensory systems work, and
they suggest fascinating possibilities for future re-
search.


* Preparation of this chapter was partially supported by
USPHS grant MY-02456. We thank Dr. Charles Shimp and the
Psychology Department, University of Utah, for providing the
facilities that we used during the writing of this chapter.

This chapter will focus on research and methods 
that seek information regarding sensory and percep-
tual systems. Problems in some related areas of stim-
ulus control, although they are sometimes considered 
part of “animal psychophysics,” are treated elsewhere 
in this volume, and we shall not dwell on them. We 
shall stress studies that seek a functional relationship 
between a carefully defined response and a carefully 
controlled stimulus dimension, and our emphasis will 
be on methods that have been successful in describing 
such relationships. Thus more consideration will be
given to studies that show how a response measure 
varies across a range of stimulus values than to studies
that simply inquire about an animal’s ability to dis-
criminate between two values. The more limited ex-
periments have helped in the development of be-
havioral techniques, and have been useful in answer-
ing such questions as “Do cats have color vision?”
However, detailed functional relationships may an-
swer not only the question “Do they?” but also “What
kind?” and “How much?”

Although this chapter will concentrate on operant 
techniques, it is well to remember that other be- 
havioral methods, involving reflexes or classical con-
ditioning, may best serve particular purposes. Indeed,
for some species, operant methods simply may not
work. In the frog, for example, psychophysical data
are badly needed to supplement rapidly mounting
data on visual anatomy and physiology, but as yet
frogs have proven refractory to positive reinforcement
techniques.

Within the operant paradigm, the choice of a 
method often hinges on the achievement of maximal 
stimulus control and simple, quantitative indices of 
this control. This means that operant methods favored
for other purposes may be inappropriate in psycho-
physical research. For example, the rate of a free oper-
ant as exhibited on a cumulative record will rarely be
seen in studies cited here: many experiments use dis-
crete-trial methods in some ways resembling the
mazes and jumping stands of earlier decades. How-
ever, the modern use of older methods involves a
more thorough analysis of the situation than did
earlier applications. Extraneous cues arising from
many sources are carefully eliminated; competing 
sources of control are minimized, and appropriate 
orienting and fixating responses are often an integral 
part of the behavioral picture. 

In the following pages we shall consider first some 
techniques used to measure sensory thresholds in 
animals and some approaches to the many problems 
that arise in this area. We shall next deal with percep-
tual and scaling studies. This research involves supra-
threshold stimuli that raise special problems of stim- 
ulus control. Finally, we shall consider the role of 
signal detection theory in animal psychophysics, out- 
lining relevant methods and examining the applic- 
ability and the usefulness of this approach. 


MEASURING SENSORY THRESHOLDS 


In threshold studies, the psychophysicist seeks an 
index of his subject’s sensory capacity either in terms 
of the minimum perceptible stimulus strength or the 
minimum perceptible difference between two stimuli.
With nonverbal subjects, such studies usually begin
by establishing a discrimination between very differ- 
ent stimuli. Threshold determination involves the 
examination of the discriminative response to a range 
of stimulus strengths, and the threshold itself is taken 
to be the stimulus value or difference that yields some 
arbitrary criterion of discriminability. 
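
As an illustration of reading a threshold off a psychophysical function, the sketch below linearly interpolates the stimulus value at which response probability crosses a criterion. The 50 percent criterion, the function name, and the data points are all assumptions made for the example; the text says only that the criterion is arbitrary.

```python
def threshold_at_criterion(stimulus_values, p_response, criterion=0.5):
    """Return the stimulus value at which p_response first crosses criterion.

    stimulus_values must be in ascending order, each paired with the
    observed probability of the discriminative response at that value.
    Linear interpolation is used between adjacent points.
    Returns None if the criterion is never reached.
    """
    points = list(zip(stimulus_values, p_response))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 == criterion:
            return x0
        # A crossing occurs when the criterion lies between the two points.
        if (p0 - criterion) * (p1 - criterion) < 0:
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    if points and points[-1][1] == criterion:
        return points[-1][0]
    return None


# Hypothetical data: stimulus intensity (arbitrary units) and the
# probability of a correct detection at each intensity.
intensities = [10, 20, 30, 40, 50]
p_hit = [0.05, 0.20, 0.60, 0.90, 0.98]
estimate = threshold_at_criterion(intensities, p_hit)
```

A different criterion would give a different threshold from the same data, which is why the criterion has to be reported along with the estimate.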

The next two sections examine methods of achiev-
ing the stimulus-response relationship upon which
threshold determination rests. First, we consider vari-
ous response measures and their associated reinforce-
ment contingencies, and then we deal with methods of
stimulus presentation.


Response Measures


METHODS THAT USE A SINGLE
RESPONSE MANIPULANDUM


Single-response methods may be chosen for several 
reasons. They are behaviorally and conceptually 
simple; they are relatively easy to instrument; and
they keep the subject in a reasonably constant posi-
tion. We shall distinguish between “go/no-go”
methods, where the stimulus dimension controls a dis-
criminated operant that is maintained by appropriate 
reinforcement contingencies, and the “conditioned 
suppression” methods, where the stimulus is paired 
with an aversive stimulus and controls the suppression 
of an ongoing operant. 


Go/no-go Methods. To an operant psychologist,
an attractive form of go/no-go procedure might be a
simple multiple schedule in which discrimination is
assessed from the relative response rates during posi-
tive stimulus (S+) and negative stimulus (S−) condi-
tions. Unless special precautions are taken, however,
this method has at least two important defects: (1) the
occurrence of reinforcement in an S+ period may act
as a cue for responding, and (2) the appearance of the
S+ may act to reinforce responses in the preceding S−
period. A study by Raslear, Pierrel-Sorrentino, and 
Brissey (1975) shows how reinforcement cues may con- 
found stimulus effects when variable interval and 
extinction alternate in a multiple schedule (mult VI 
EXT). This research examined discriminations of 
auditory intensity in the chinchilla, using the absence 
of sound as the S+ and its presence, at varying inten-
sities, as the S−. Since the design included unrein-
forced as well as reinforced presentations of the S+,
the authors were able to assess the role of reinforce- 
ment as a cue. At large intensity differences, a measure 
based on unreinforced S+ periods showed about the
same discriminability as a measure based on rein-
forced S+ presentations. As intensity difference de- 
creased, however, reinforcement cues appeared to as- 
sume more importance, so that the measure based on 
reinforced S+ presentations continued to show differ- 
ential responding even at very small intensity differ- 
ences. 

The go/no-go paradigm may also be used in
discrete-trial methods that program very short stim- 
ulus presentations and measure the probability that a 
single response will occur on each trial. This prob- 
ability measure is often favored over response rate be- 
cause of its relative simplicity and because response 
rate appears to be affected by complex nonstimulus 
factors. Furthermore, reinforcement cues cannot con- 
found stimulus effects, since reinforcement terminates 
the trial. 

As a rule, go/no-go methods employ the traditional
reinforcement contingencies; that is, they program re- 
inforcement of responses in the presence of the stim- 
ulus and their extinction in its absence. However, 
since subject chambers and manipulanda are usually 
designed with a view to facilitating the response, these 
procedures often favor excessive “false positive” re-
sponses. That is, subjects tend to err in the direction 
of responding when the stimulus is absent, rather than 
failing to respond when it is present. Signal detection 
theory has focused attention on these two types of 
error (“false alarms” and “misses”) and suggests that 
an imbalance between them indicates a “biased” 
subject.
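
The two error types can be turned into separate indices of sensitivity and bias. The sketch below assumes the equal-variance Gaussian model of signal detection theory, which this passage does not itself specify, and uses the standard indices d' = z(H) − z(F) and c = −[z(H) + z(F)]/2, where H is the hit rate and F the false-alarm rate. The function name and the trial counts are hypothetical.

```python
from statistics import NormalDist


def sdt_indices(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) from the four trial-outcome counts.

    Assumes the equal-variance Gaussian model. Rates of exactly 0 or 1
    would give infinite z-scores, so this sketch raises ValueError
    rather than applying a correction.
    """
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    if not (0 < hit_rate < 1 and 0 < fa_rate < 1):
        raise ValueError("rates of 0 or 1 need a correction before use")
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)
    # Negative values indicate a bias toward responding ("go").
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Hypothetical counts: few misses but many false alarms.
d_prime, criterion = sdt_indices(hits=90, misses=10,
                                 false_alarms=40, correct_rejections=60)
```

For these invented counts the criterion comes out negative, the pattern described above in which false alarms outnumber misses.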

An experiment by Terman and Terman (1972) 
shows how a go/no-go situation may be modified in
an attempt to discourage response bias. In a study of 
auditory intensity discrimination in rats, these authors 
introduced symmetrical contingencies for the two types 
of error. The rats were required to press a nose key in 
the presence of a standard stimulus and to withhold 
the response in the presence of a comparison intensity. 
Time-outs followed both failures to press during the 
standard stimulus presentation and the occurrence of 
presses during the comparison intensities. Positive 
reinforcement was also symmetrical; that is, it fol- 
lowed both “hits” (key responses to the standard stim- 
ulus) and “correct rejections” (withholding the re- 
sponse during the comparison stimulus presentations). 
Despite these contingencies, Terman and Terman 
found that their subjects tended to be biased in the 
direction of responding, for false alarms were more 
probable than misses. The authors suggest that differ- 
ences in the topographies of the response for hits 
(“press”) and correct rejections (“don’t press”) might 
account for some of the bias. The nose key response
was more explicitly defined in terms of its topography
and also differed in its temporal relation to signal on- 
set, since it could occur any time during the 3-sec 
stimulus presentation and still produce reinforcement. 

The discriminated avoidance response has also 
been used in psychophysical studies, especially with 
species that are not readily amenable to positive rein- 
forcement procedures. In go/no-go avoidance a re-
sponse is maintained because it removes a stimulus 
signaling shock or some other aversive event. Varia- 
tions in the stimulus along some dimension yield 
corresponding changes in response probability, which 
describe a psychophysical function. A nice example 
of this procedure has been reported by Clack and Her- 
man (1963), who used it to obtain a set of auditory 
thresholds in the monkey. Their sessions were divided 
into trials, each starting with the presentation of a
white light. On some trials a tone followed the onset 
of this light; the tone signaled shock, and lever presses 
in its presence avoided the shock. A lever press in the 
presence of the tone was a hit, and was reinforced by 
shock avoidance. A press in the presence of the white 
light but in the absence of the tone was a false alarm 
and was punished by shock as, of course, were misses 
—failures to respond during the tone. A similar pro- 
cedure has been used by Saunders (1969) in a study of 
auditory intensity discrimination in the cat, a species 
known for its finicky attitude toward food reinforcers. 

Although the use of electric shock has been effec- 
tive, it has certain disadvantages. Some species— 
pigeons, for example—require wires that may inter- 
fere with the animal’s freedom of movement. More 
important is the possible disruptive effect of punish- 
ment on the subject’s behavior. Nonetheless, one com- 
parison between positive and negative reinforcement 
techniques (Sidley, Sperling, Bedarf, & Hiss, 1965) re- 
ports that, while the positive reinforcement method 
yielded more “cooperative” subjects, similar spectral
sensitivity functions were generated by the two pro- 
cedures. 


Conditioned Suppression Method. Under suitable 
conditions an aversive signal will suppress the rate of 
an operant response, and the degree of this suppression
may be used as a psychophysical response
measure. The method involves the superposition on a
baseline operant of signals that terminate after a
minute or so with a shock or other aversive event. As 
a result of such pairings, the signal itself acquires sup- 
pressing properties. The amount of suppression is 
measured by the change in the baseline response rate, 
and it usually is stated in terms of a ratio composed 
of rates before and after the introduction of the
signal. A stable baseline is important, since pauses in
the absence of the signal can be recorded as “false 
alarms.” Variable-interval and variable-ratio schedules 
are frequently used to maintain such a baseline. (For 
an extensive discussion of the suppression method and 
related data see Blackman, Chapter 12 in this 
volume.) 
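
The text does not commit to a particular form for the suppression ratio. The sketch below assumes one commonly used form, B / (A + B), where A is the number of responses in a pre-signal period and B the number in a signal period of equal length. The function name and the sample counts are illustrative only.

```python
def suppression_ratio(pre_signal_responses, signal_responses):
    """Return B / (A + B) for equal-length pre-signal (A) and signal (B) periods.

    0.0 means complete suppression during the signal; 0.5 means no change.
    Returns None when there were no responses in either period.
    """
    total = pre_signal_responses + signal_responses
    if total == 0:
        return None  # the ratio is undefined without any baseline responding
    return signal_responses / total


# Hypothetical counts: 40 responses before the signal, 10 during it.
print(suppression_ratio(40, 10))
```

The None case shows why a stable baseline matters: a pause that happens to fall in the pre-signal period leaves the ratio undefined or badly inflated.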

The psychophysical function resulting from the 
suppression method describes the relationship be- 
tween some dimension of the aversive signal and the 
amount of suppression that the signal produces. Many 
such functions have been acquired by Smith (1970) 
and his colleagues for a variety of species and sensory 
modalities. One study describes olfactory sensitivity in 
the pigeon (Henton, 1969). With key pecking maintained
on a variable-interval schedule as a baseline, delivery
of controlled amounts of amyl acetate was paired with
shock at the birds’ pubis bones. When a strong
concentration of the odorant
had acquired good suppressing properties, the concen- 
tration was gradually decreased. A psychophysical 
function related the odorant concentration to the
suppression ratio.

A series of studies on the auditory sensitivity of 
neurologically mutant mice (Ray, 1970; Sidman, Ray, 
Sidman, & Klinger, 1966) examined the reliability and
validity of the suppression method. This research 
produced such data as, for example, the finding that 
mice with inner-ear defects failed to suppress to audi- 
tory stimuli, although suppression to visual stimuli 
was still present. These studies support the reliability 
of the suppression method: the data replicated well 
within subjects, and agreement between subjects was 
also good. It should be noted, however, that while
conditioned suppression is measured in terms of oper- 
ant response rate, the phenomenon is generally con- 
sidered an instance of classical conditioning (see 
Chap. 12). As with other differences between some of 
the methods we describe, we do not know the extent 
to which thresholds measured by suppression might 
differ from thresholds based on a discriminated oper- 
ant. Even were such comparisons done, they may be 
difficult to interpret, because obtained thresholds may 
depend on so many aspects of the procedures used to 
measure them. 


METHODS THAT USE TWO OR MORE
RESPONSE MANIPULANDA


In most human psychophysics, the subject is instructed
to give a specifically defined response on each
trial (“yes” or “no,” “brighter” or “dimmer,” or the
like) rather than to respond on some trials and to
withhold the response on others. A comparable
procedure with animals requires a manipulandum for
each response alternative. An advantage of such 
symmetrical response requirements is that the experi- 
menter can then distinguish “failures to respond” 
from “incorrect responses” and is thus unlikely to 
confuse poor attention, “freezing,” and so forth with 
poor stimulus discriminability. Response bias may 
also be minimized to the extent that the alternative 
responses are comparable with one another and rein- 
forcement contingencies are symmetrical. 


Forced-choice Procedures. The most straightfor- 
ward of the multiresponse techniques is the two-re- 
sponse forced-choice method, used extensively in visual 
psychophysics. Here, each of two stimuli appears near
its associated response manipulandum, and the animal
is reinforced for making the response corresponding
to the “correct” stimulus, which might be defined
as “the one that is illuminated,” “the one that has
the stripes,” or perhaps “the one that is the brighter.”
The position of the correct stimulus should vary, of 
course, in such a manner that consistent responding to 
one of the manipulanda results in chance perfor- 
mance. Although this method is most frequently used 
for problems in vision, Wilson, Stamm, and Pribram 
(1960) successfully investigated tactual discrimination 
in monkeys by training their subjects to choose the 
coarser of two grades of sandpaper. 
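
The requirement that a position habit yield chance performance can be met by a balanced trial sequence. The sketch below is one way to build such a sequence, not a procedure taken from any of the studies cited. The function name, the limit on run length, and the rejection-sampling approach are all assumptions made for illustration.

```python
import random


def balanced_positions(n_trials, max_run=3, seed=None):
    """Return a left/right sequence with equal counts and no run longer than max_run."""
    if n_trials % 2:
        raise ValueError("n_trials must be even so each side is correct equally often")
    if max_run < 1:
        raise ValueError("max_run must be at least 1")
    rng = random.Random(seed)
    sequence = ["L", "R"] * (n_trials // 2)
    while True:
        rng.shuffle(sequence)
        # Find the longest run of identical positions in this shuffle.
        run = longest = 1
        for prev, cur in zip(sequence, sequence[1:]):
            run = run + 1 if cur == prev else 1
            longest = max(longest, run)
        if longest <= max_run:
            return list(sequence)


positions = balanced_positions(20, max_run=3, seed=1)
# A subject that always chooses the left key is correct on exactly half the trials.
print(positions.count("L") / len(positions))
```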

Frequently, a third, “observing” response is added 
to the forced-choice method, as in a series of studies 
on goldfish color vision described by Yager and
Thorpe (1970). These investigators used tanks that
contained two response keys at one end and an
observing key at the other. A press on the illuminated
observing key presented the stimulus and set up
reinforcement on one of two response keys; the fish then
swam to the other end of the tank and made its choice
response. In their work on spectral sensitivity, Yager
and Thorpe caused the observing response to
illuminate one of the two choice keys with monochromatic
light, and a response to this lighted key brought food. 
Following the choice response was an intertrial inter- 
val during which the entire tank was illuminated so as 
to produce chromatic adaptation conditions desired 
by the experimenters. Using a similar method, P. 
Blough (1971) assessed pigeon visual acuity at con- 
trolled target distances. Here, one purpose of the ob- 
serving response was to force the bird to retreat a 
considerable distance from the target stimuli before 
each trial. Again, operation of an observing key at one 
end of the experimental chamber set up stimuli at the 
other end. One stimulus key was striped, the other
blank, and pecks at the striped key produced
reinforcement. Figure 1 shows that a vertical partition
and photocell arrangement forced the bird to make its
choice while it was still at the far end of the chamber.

Fig. 1. An experimental chamber illustrating the use of an
observing response combined with a two-key forced-choice task.
To initiate a trial, the bird pecks the observing key, thus
opening the shutter and illuminating the two stimulus keys. As
the bird enters an alley leading to one of the two keys, it
breaks a photocell beam, thus recording its choice and initiating
the appropriate consequences. If the choice is correct, the
shutter remains open until the bird pecks the stimulus key for
food reinforcement. An incorrect choice closes the shutter at
once. The purpose of the hurdles is to help delay the choice
response. (From P. Blough, 1971. © 1971 by the Society for the
Experimental Analysis of Behavior, Inc.)

Although it complicates the apparatus, a three- or 
four-response forced-choice procedure has desirable 
aspects. Additional choices extend the range between 
perfect and chance performance; for example, while 
chance performance is 50% for a two-response method, 
it falls to 25% for a four-response method. Three or 
more responses also allow the experimenter to ask 
that a subject choose the “different” stimulus. This 
more general paradigm may be useful for a variety of 
problems, at least for subjects capable of responding 
to the “difference” concept. DeValois and his col- 
leagues have collected psychophysical data on monkey 
color vision with the multiresponse procedure, illus- 
trated by a study of increment thresholds reported by 
Jacobs (1972). A tungsten source illuminated three 
response panels, and a monochromatic light was added 
to this background on one of the three panels. Squirrel 
monkey subjects were reinforced for responding to 
the panel with the added monochromatic light, what- 
ever its wavelength. Each of the three panels was 
correct equally often. By varying wavelength and in- 
tensity of the added light, Jacobs measured sensitivity 
under these conditions across a broad range of wave- 
lengths. 
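
The arithmetic behind the wider range can be made explicit. In the sketch below, the chance level for n equally likely alternatives is 1/n. The rescaling function is a standard correction for chance that we add for illustration; it is not described in the studies cited, and both function names are hypothetical.

```python
def chance_level(n_alternatives):
    """Proportion correct expected from guessing among equally likely alternatives."""
    return 1.0 / n_alternatives


def corrected_for_chance(p_correct, n_alternatives):
    """Rescale proportion correct so that chance maps to 0 and perfect to 1."""
    chance = chance_level(n_alternatives)
    return (p_correct - chance) / (1.0 - chance)


for n in (2, 3, 4):
    print(n, chance_level(n))
```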


“Yes-no” Procedures. Forced-choice procedures us- 
ing two or more keys may not be as subject to the 
effects of response bias as are less symmetrical para- 
digms. Unfortunately, successful use of forced-choice 
with animals probably necessitates the simultaneous 
presentation of two or more stimuli that can be
adequately separated in space. When stimuli must occur
in succession, as in most auditory tasks, human
subjects may be instructed to choose the earlier or later
stimulus, but a corresponding task is probably too
difficult for most nonhuman animals. For such
problems, the two-response “yes-no” task is a popular
alternative. Here, a response to one manipulandum is
reinforced if a specified condition is present (the “yes”
response) and to another if the condition is absent
(the “no” response). In this task, of course, response
preferences may affect the data by altering the a
priori probability of “yes” or “no” responses. A bias toward
the “yes” key tends to drive the “threshold” down,
while a bias toward the “no” key drives it up. Work
of Irwin and Terman (1970) on auditory sensitivity in 
rats illustrates the use of this method. To start a trial, 
the rat positioned itself so as to break a photocell 
beam. This caused the lighting of two choice keys and 
also produced either noise alone or a tone added to 
noise. The animal was to press the left key for noise 
alone and the right key for noise plus tone. Electrical 
brain stimulation reinforced a correct response to 
either key; a time-out followed each error. One of 
the two rat subjects reached approximately 95% cor- 
rect responses when the tone signal was strong; the 
other rat’s best performance was about 85% correct. 
Despite the symmetrical reinforcement contingencies, 
both rats showed position preferences that increased 
as the signal became more difficult to detect. 

Pigeons can learn a variation of the yes-no method 
that is reminiscent of DeValois’s and Jacobs’s use of 
forced choice with oddity problems. Birds respond 
to one key if two stimuli match and to another if the 
stimuli differ. Wright (1972) used this method to 
study pigeon wavelength discrimination. A center ob- 
serving key contained a bipartite field whose halves 
could be illuminated by the same or by different 
monochromatic lights. A peck on this key illuminated 
two side keys, and a right-key response was then cor- 
rect if the stimulus lights were the same, while a left- 
key response was correct if they differed. Both types 
of correct responses were reinforced, and errors pro- 
duced time-outs. During each session, a single refer- 
ence wavelength appeared with each of a number of 
comparison wavelengths. Unfortunately, it is not clear 
how well the birds discriminated “difference” on the 
center key as a concept apart from the specific wave- 
length differences to which they responded. That is, 
the report does not indicate whether the birds learned 
each new discrimination more quickly than the last. 
Evidently the task was difficult, however, for 6 out of
12 birds were eliminated from the study because of 
poor performances. 

A variation of the yes-no procedure, described by
Stebbins (1970) and some of his colleagues, has been
particularly successful in auditory psychophysics. Here,
one manipulandum has both an observing function
and a “no” function. Responses to this lever produce
the auditory signal on a variable-interval or
variable-ratio schedule. The subject must switch to another
lever (“yes”) during the signal to be reinforced;
however, switching to this second lever either before or
after the signal brings a time-out. Occasional “catch”
trials, when no tone is presented, measure the
probability of false alarms. Methods such as this have
yielded clear auditory data for monkeys (Gourevitch,
1970; Stebbins, 1970) and bats (Dalland, 1970). The
procedure has also been used to measure latencies, as
we shall see in a later section. Figure 2 shows a set of
psychophysical functions obtained by this method
with a monkey subject.
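
The bookkeeping for signal and catch trials can be sketched as follows. The trial records are hypothetical, and the representation of each trial as a pair of true/false values is our own assumption rather than anything specified by the authors cited.

```python
def detection_rates(trials):
    """Return (hit rate, false-alarm rate) from (signal_present, said_yes) pairs.

    Signal trials give hits and misses; catch trials give false alarms and
    correct rejections. A rate is None when no trial of that kind occurred.
    """
    hits = misses = false_alarms = correct_rejections = 0
    for signal_present, said_yes in trials:
        if signal_present:
            if said_yes:
                hits += 1
            else:
                misses += 1
        else:
            if said_yes:
                false_alarms += 1
            else:
                correct_rejections += 1
    signal_trials = hits + misses
    catch_trials = false_alarms + correct_rejections
    hit_rate = hits / signal_trials if signal_trials else None
    false_alarm_rate = false_alarms / catch_trials if catch_trials else None
    return hit_rate, false_alarm_rate


# Hypothetical session: 8 tone trials (6 reported) and 4 catch trials (1 reported).
session = [(True, True)] * 6 + [(True, False)] * 2 + [(False, True)] + [(False, False)] * 3
print(detection_rates(session))
```

A high false-alarm rate on catch trials warns that the hit rate overstates sensitivity.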

As we have indicated, the measures described above,
though all useful, differ in their suitability for
particular sensory modalities, in their ease of
instrumentation, and very likely in their suitability for
particular species. Unfortunately, we are generally unable to
say whether they also differ with respect to the
threshold measures that they yield. A few studies
suggest that methodological differences may not be great;
the experiments of Sidley et al. (1965), mentioned
above, showed similar findings from a method based
on positive reinforcement and one based on aversive
control. Mentzer’s (1966) research also failed to reveal
any clear difference among thresholds obtained from
yes-no, two-key forced-choice, and four-key
forced-choice methods. Studies that make such comparisons
are, however, rather difficult to interpret, since many
aspects of each situation are chosen arbitrarily (e.g.,
stimulus placement, response force required, etc.),
and effects of these parameters could obscure real
methodological effects.

Fig. 2. A set of psychophysical functions for a single monkey
showing the relation between frequency of hearing and sound
intensity. Each curve describes this relationship for a different
tone frequency. (From Stebbins, 1970.)


ADAPTING THE RESPONSE TO THE SPECIES 


The animal psychophysicist often wishes to study 
species other than the rats, pigeons, or monkeys for 
which reinforcers, manipulanda, and general meth- 
odology have become standardized. Some species, such 
as the cat or the turtle, may be chosen because exten- 
sive sensory physiological data are available. Others, 
such as the bat or the sea lion, are of interest because 
of the special characteristics of a sensory system. Often,
such “new” species are found to perform well in
relatively standard situations: for example, turtles,
squirrels, seals, and chinchillas all press keys or levers for
appropriate consummatory rewards. But some species
do not adapt well to such familiar tasks, and the
experimenter must apply his imagination and his
knowledge of the response repertoire of his subjects to the
design of appropriate experimental arrangements.
Research with bats, described by Dalland (1970),
exemplifies this point. The experiment, inspired by the
bat’s use of echolocation in detecting prey, involved
the assessment of the auditory response to
very-high-frequency tones. Since the nonflying movements of
bats are minimal and poorly adapted to the usual
manipulanda, Dalland chose to measure gross bodily
movement. An observing response required the bat
to position itself on a platform in such a way as to
break a photocell beam. In front of this platform was
a tube from which the auditory signal emanated.
Thus the observing response harmonized with the
bat’s natural tendency to orient toward sounds, and it
also ensured constant position with regard to the
stimulus. In this procedure, the bat was reinforced, in
the presence of sound, for walking to a food cup.
Figure 3 illustrates this arrangement. Since bats do not
readily walk, this response had the advantage of
reducing the probability of false alarms. Although
extended response shaping was required, Dalland
successfully measured tone thresholds through a range of
high frequencies.

Fig. 3. Apparatus for obtaining audibility data for the bat.
The bat positioned itself on the platform at the rear of the
cage in such a way as to break a photocell beam. When a tone
occurred the subject crossed the cage for food reinforcement.
The tone came from the tube beyond the platform. (From
Dalland, 1970.)

Naturalistic considerations also suggested a useful
response in a study of visual acuity of sea lions. These
animals emit a “barking” vocalization which the
experimenters, Schusterman and Balliet (1970), brought
under visual control, at first by associating the visual
stimulus with a situation that tended to elicit the
bark. Eventually they were able to test visual acuity
by requiring the sea lions to vocalize when a grating
stimulus appeared but to remain silent when the
stimulus was blank. An important advantage of
vocalization as a response is that it is not tied to a
spatial location. The authors thus were able to test
acuity at various controlled target distances.

Not only an animal’s response repertoire, but also 
the conditions under which the responses ordinarily 
occur may bear on the choice of psychophysical meth- 
odology. The cat, for example, has a reputation for 
recalcitrance in operant situations, but this reputation 
may be due more to obtuse experimenters than to 
obstinate animals. Recent studies indicate that posi- 
tive reinforcement methods are quite feasible if the 
cat is sufficiently hungry and if its task is properly de- 
signed. Berkley (1970), for example, found that the 
natural tendency for the cat to use its paws, though 
apparently ideal for lever pressing, was actually an 
obstacle to visual control, since cats tend to watch 
their paws as they use them. Berkley’s successful ap- 
paratus employed a nose key, which the cat could 
press only after it had placed its head through a hole 
too small to include the paws. The author reported 
rapid success in training visual discriminations with 
this apparatus, and found it more easily automated 
than some earlier methods used with cats. 

Less obvious, but possibly of great importance, are 
considerations of stimulus-response associability
related to the “preparedness” notion of Seligman (1970).
A report by Nye (1973) suggests, for example, that the
pigeon’s key peck is poorly controlled by laterally
placed visual stimuli, although performance on an
identical discrimination is excellent when the stimuli
are located in the bird’s frontal field of view. This
finding, if confirmed, illustrates the subtlety of species
idiosyncrasies with which the animal psychophysicist 
may have to cope. 


Methods of Stimulus Presentation 


The experimenter’s goal, in threshold studies, is 
typically to estimate the stimulus value that can be 
detected 50% of the time. Preliminary work is usu- 
ally necessary so that stimuli may be chosen over a 
range that will include the threshold value and across 
which stimulus and response values will systematically 
covary. In working with human subjects, experimen- 
ters commonly omit stimuli far above or below 
threshold, since such presentations are inefficient. In 
animal work, however, some signals must be of known 
detectability if reinforcement contingencies are to be 
effective in training the animal and in maintaining its
behavior during testing. Thus there must be a num- 
ber of “wasted” trials, during which the stimulus is 
well above or below threshold. Experimenters have 
adapted, with this modification, most of the well- 
known human psychophysical methods, which pre- 
scribe the spacing of stimuli around threshold and the 
order of stimulus presentation. 


TESTING WITH A BROAD RANGE
OF STIMULUS VALUES 


To obtain complete psychophysical functions, ex- 
perimenters usually choose a stimulus set that ex- 
tends in regular steps from values that are almost 
always detected to values that are very rarely de- 
tected. Once chosen, these stimuli may be presented 
in ascending order of strength (or increasing differ- 
ence between standard and comparison, when a differ- 
ence threshold is sought), descending order, or in 
random order. In animal work, descending order is 
perhaps the most popular, for it seems best to start 
with an easy discrimination and proceed to those more 
difficult. This procedure thus incorporates a “shaping” 
or “fading” aspect. Although this method resembles 
the psychophysical method of limits, stimuli tend to 
be more widely spaced than in a human experiment, 
and each stimulus value is apt to occur in blocks of 
several trials before a new value is introduced. The 
block method is desirable because performance may 
change somewhat over trials, and a number of trials
may be necessary to yield a reliable indication of
performance. In the study of pigeon acuity, illustrated 
in Figure 1, trial blocks were very large; altogether 
each block included 256 trials and occupied two ex- 
perimental sessions. As is true of many animal studies, 
this procedure required the birds to reach a criterion 
level of performance at a strong signal (wide stripes) 
before proceeding with progressively weaker signals. 
The research affirmed the importance of these easy 
sessions, for performance on easy discriminations 
tended to be poor following sessions at the most diffi- 
cult discriminations. 

Hodos and Bonbright (1972) describe a procedure 
that employed much smaller blocks of descending 
stimuli. The study neatly incorporates warm-up trials 
and checks on the subject's discrimination perfor- 
mances. Pigeon subjects discriminated a standard light 
from lights of varying comparison intensities. Each
session began with a 20-trial warm-up with a large
intensity difference (.8 log units). The next 20 trials
constituted an “assessment” period, at the same large
intensity difference, to determine the subject's basic
performance level. If the bird’s error rate was greater
than 10% during this period, no new comparison
values were introduced, and the entire session was
devoted to training. If performance met criterion,
however, the data were recorded and further
20-trial blocks occurred, each at a smaller intensity
difference than the last. After testing at the smallest
difference, the warm-up condition at .8 log-unit
difference was repeated. Then a final descending series
began. The authors state that “the warm-up period
between the first and second descending series of stimuli
served to dissipate extreme response biases that de- 
veloped . . . during the later, more difficult discrimi- 
nations of the first sequence and would otherwise be 
carried over into the earlier discriminations of the 
second sequence” (p. 473). 
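
The session logic just described can be laid out as a short planning routine. This is a sketch of our reading of the procedure, not the authors' own program. The comparison differences passed in are hypothetical, since the text gives only the .8 log-unit warm-up value, and the length of a training session is left unspecified (None).

```python
def plan_session(assessment_errors, differences, block_size=20,
                 max_error_rate=0.10, warmup_difference=0.8):
    """Return a list of (phase, log-unit difference, trials) for one session."""
    plan = [("warm-up", warmup_difference, block_size),
            ("assessment", warmup_difference, block_size)]
    # An error rate greater than 10% in the assessment period cancels testing.
    if assessment_errors / block_size > max_error_rate:
        plan.append(("training", warmup_difference, None))
        return plan
    descending = sorted(differences, reverse=True)  # largest difference first
    for series in ("first series", "second series"):
        plan.extend((series, diff, block_size) for diff in descending)
        if series == "first series":
            # The repeated warm-up is meant to dissipate response biases.
            plan.append(("warm-up", warmup_difference, block_size))
    return plan


for phase in plan_session(assessment_errors=1, differences=[0.4, 0.2, 0.1]):
    print(phase)
```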

Some animal studies have used series of ascending 
stimulus values, usually along with descending series. 
Used alone, the ascending order may generate a high 
rate of false alarms (Dalland, 1970). It may also yield 
higher thresholds than the descending series (e.g., 
Mishkin & Weiskrantz, 1959; Terman, 1970), although
Smith (1970) found similar thresholds by the two 
methods in his conditioned suppression procedure. As 
in human psychophysics, it seems appropriate to aver- 
age the results of ascending and descending series 
when both are used. 

A set of stimulus values may also occur in random 
order. This procedure has yielded orderly
psychophysical functions, and we might expect it to
maintain the basic discrimination most effectively, because
easy discriminations are mixed with more difficult
ones. In the monkey auditory work of Stebbins, which 
we described earlier, audiometric functions from the 
method of limits and the method of constant stimuli 
appear very similar. The pigeon acuity study (P. 
Blough, 1971) compared a random stimulus method 
with a descending series method, and again the two 
yielded similar thresholds. However, performance on 
the easy discriminations was somewhat worse for the 
random stimulus method, despite the fact that warm- 
up sessions and gradual introduction of less detectable 
stimuli preceded sessions based on random
presentation. Another disadvantage of random ordering is that
it tends to commit the experimenter to a set of
stimulus values that may later turn out to be
inappropriate.

As we have said, a threshold is generally defined as
a stimulus value associated with a criterion response
probability. Because the criterion is rarely met
exactly, the actual threshold value usually comes from
interpolation on a graph that relates the stimulus and
response measures. In methods that use a broad range
of stimulus values, this graph will usually show a wide
range of response probabilities, extending from chance
to nearly perfect performance. To arrive at a threshold
figure, many experimenters simply connect the two
points on either side of their threshold criterion and
interpolate appropriately. A second procedure, which
makes use of more of the data, is to fit a function to
all of the points and then proceed with the
interpolation. (A transformation which makes the function
approximately linear is helpful for this purpose, since
a straight-line fit is relatively easy to achieve either
statistically or by eye.) This second procedure is more
complicated, but is worth the trouble when the data
are variable, since inclusion of all points contributes
to the reliability of the threshold estimate.
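
Both procedures can be sketched briefly. The first function interpolates between the two points that bracket the criterion. The second assumes that a normal-deviate (z) transformation makes the function roughly linear, fits a least-squares line to all points, and solves for the criterion. The choice of transformation, the function names, and the data are our assumptions for illustration.

```python
from statistics import NormalDist


def interpolate_threshold(stimuli, p_detect, criterion=0.5):
    """Linear interpolation between the two points that bracket the criterion."""
    points = sorted(zip(stimuli, p_detect))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 != p1 and min(p0, p1) <= criterion <= max(p0, p1):
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("criterion is not bracketed by the data")


def fitted_threshold(stimuli, p_detect, criterion=0.5):
    """Fit a line to z-transformed probabilities, then solve for the criterion.

    Every probability must lie strictly between 0 and 1 for the transform.
    """
    normal = NormalDist()
    z = [normal.inv_cdf(p) for p in p_detect]
    n = len(stimuli)
    mean_x = sum(stimuli) / n
    mean_z = sum(z) / n
    sxx = sum((x - mean_x) ** 2 for x in stimuli)
    sxz = sum((x - mean_x) * (zi - mean_z) for x, zi in zip(stimuli, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return (normal.inv_cdf(criterion) - intercept) / slope


# Hypothetical data: stimulus values and proportions of "yes" responses.
levels = [1.0, 2.0, 3.0, 4.0]
proportions = [0.1, 0.3, 0.7, 0.9]
print(interpolate_threshold(levels, proportions))
print(fitted_threshold(levels, proportions))
```

With orderly data such as these the two estimates nearly coincide. The fitted line should differ mainly when individual points are noisy.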


THRESHOLD TRACKING


The methods just described are inefficient, in a
sense, because they include many trials on which the 
stimulus is well above threshold. In tracking methods 
(also known as “staircase,” “up-down,” or “titration” 
methods), most stimuli presented are near threshold, 
so there are fewer wasted trials. However, tracking 
methods may require long training and relatively
complex apparatus. 

Tracking has been used with a number of sense 
modalities in a variety of species. An early applica- 
tion in animal psychophysics produced data on dark 
adaptation in pigeons (D. Blough, 1958, 1961). The 
paradigm was essentially two-manipulandum, yes-no,
although the session was not divided into trials. Pecks
on the “no” key (key B) were reinforced when a stim- 
ulus patch was dark, and pecks on the “yes” key (key 
A) were correct when the stimulus was illuminated. 
Correct key B responses were reinforced with food, 
and key A responses were maintained because they 
occasionally turned out the stimulus light, thus setting 
the occasion for key B reinforcement. Variable-interval 
and variable-ratio requirements helped to prevent 
control of pecking by temporal or numerical factors; 
that is, reinforcement had no close relationship to 
elapsed time or to number of responses emitted. 
Threshold tracking resulted from this procedure be- 
cause responses on the “no” key raised the intensity 
of the stimulus and responses on the “yes” key lowered it;
thus a well-trained subject kept the stimulus light 
near its threshold value, and a continuous record of 
intensity followed the threshold through time. 
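
The core of the tracking rule can be shown with a highly simplified simulation. The sketch below omits the variable-interval and variable-ratio requirements and the food contingencies, and it assumes an idealized observer that reports “yes” whenever the intensity is at or above a fixed value. All names and numbers are hypothetical.

```python
def track(respond_yes, start, step, n_responses):
    """Return the record of stimulus intensity across successive responses."""
    intensity = start
    record = [intensity]
    for _ in range(n_responses):
        if respond_yes(intensity):
            intensity -= step  # a "yes" (key A) response lowers the stimulus
        else:
            intensity += step  # a "no" (key B) response raises it
        record.append(intensity)
    return record


# Idealized observer (an assumption): "yes" whenever intensity is 3.0 or more.
record = track(lambda level: level >= 3.0, start=6.0, step=1.0, n_responses=10)
print(record)
```

After the initial descent, the record oscillates about the observer's cutoff, which is the sense in which the intensity record follows the threshold.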

The tracking procedure raises special problems of 
stimulus control, some of which have been discussed 
by Blough (1958) and more recently by Clack and 
Harris (1963) and Rosenberger (1970). It is necessary, 
for example, to discourage the subject from simply 
responding alternately to the keys until reinforcement 
occurs. An effective means of discouraging this strategy 
is to have incorrect responses increase the ratio by 
which reinforcement is programmed. Another poten- 
tial problem is the deterioration of stimulus control 
that may occur when a discrimination remains diffi- 
cult over a long period of time. To compensate for 
this, Clack and Harris increased signal strength to a
relatively high level following reinforcement to
provide a “warm-up” period.

The tracking idea is readily adaptable to several of 
the response paradigms that we have previously de- 
scribed. For example, a combination of tracking and 
conditioned suppression is described by Ray (1970) 
and by Rosenberger (1970). Here, degree of response 
suppression controlled signal strength. When suppres- 
sion exceeded some criterion, signal strength de- 
creased; when suppression no longer met the criterion, 
signal strength increased. A threshold could be de- 
rived from the stimulus values, since these were cor- 
related with response suppression. 

In trialwise tracking methods, the stimulus value 
may change following each presentation (the “stair- 
case” method) or following a block of trials. Thorpe 
(1971) used the latter procedure in a study of goldfish 
spectral sensitivity. Stimulus intensity decreased when 
the number of correct responses per block exceeded a 
criterion and increased when the number fell below 
the criterion. The block method bases each stimulus 
change on more response information than does the
single-trial staircase method, and one comparison of
the methods reports that the block procedure indeed 
yields less variable data (Moskowitz & Kitzes, 1966). 
The staircase method, however, provides more
threshold indications per unit time, and this may be an
overriding factor when sensitivity is to be followed 
through time. 
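
The block rule can be stated compactly. In this sketch the block size, criterion, step size, and scores are hypothetical, and leaving the stimulus unchanged when a block falls exactly at criterion is our own assumption, since the text does not say what happens in that case.

```python
def next_intensity(intensity, n_correct, criterion, step):
    """Lower the stimulus after a good block, raise it after a poor one."""
    if n_correct > criterion:
        return intensity - step
    if n_correct < criterion:
        return intensity + step
    return intensity  # exactly at criterion: unchanged (assumption)


def track_blocks(block_scores, start, criterion, step):
    """Return the intensity for each block, plus the value after the last block."""
    intensities = [start]
    for n_correct in block_scores:
        intensities.append(next_intensity(intensities[-1], n_correct, criterion, step))
    return intensities


# Hypothetical scores out of 10 trials per block, with a criterion of 7 correct.
print(track_blocks([10, 9, 8, 5, 8, 6], start=5, criterion=7, step=1))
```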

Are thresholds found by the tracking method
comparable to thresholds from other methods? In Blough’s
method, reinforcement contingencies for the two keys 
were asymmetrical, since food was contingent only on 
“no” responses (Saslow, 1967). Since “no” responses 
increase the signal’s strength, the resulting thresholds 
could be spuriously high. Some confirmation of such 
an effect comes from Clack and Harris (1963), who 
noted that their rats tended to maintain the auditory 
stimulus at suprathreshold levels. Rosenberger (1970) 
identified another possible difficulty with tracking: 
when the animal controls the signal, responses may 
not be reinforced with equal probabilities at all signal 
strengths. Thus the tendency to switch from key A to 
key B could become stronger in the presence of cer- 
tain signals than for others, with a distortion of 
threshold data resulting. 

Direct comparisons between tracking and other 
methods are rare. Symmes (1962), in his studies of 
flicker discrimination in monkeys, reported that a 
technique much like Blough’s generated lower critical
flicker-fusion frequencies (higher thresholds) than did
a go/no-go method using comparable target
parameters. However, Stebbins (1970) found that auditory
thresholds generated by his version of the tracking 
method were very close to those yielded by a method 
of limits and a method of constant stimuli. Perhaps 
Stebbins’s failure to find differences had to do with his 
modification of the technique. While Blough and 
Symmes provided direct reinforcement only for re- 
sponses appropriate to the signal’s absence, Stebbins 
also reinforced with food responses appropriate to the 
signal’s presence, whatever its strength, and punished 
(by a time-out) failure to report a signal. These
contingencies perhaps favor a stronger “yes” response and
thus a lower threshold than the less symmetrical
reinforcement procedure. Such differences among various
versions of the tracking method seem to preclude any 
blanket statement comparing this method with others. 

The definition of threshold is somewhat less stan- 
dardized in the tracking method than in the other 
methods we have described. Basically, the experimen- 
ter seeks the stimulus value at which the probability 
of switching from the “yes” to the “no” response, and 
vice versa, is greatest. This may be done visually, by 
drawing horizontal lines that bisect various subsets of 
data and averaging the stimulus values represented 
by these lines (Clack & Harris, 1963). Alternatively, 
reversal points—those stimulus values at which the 
subject switches from key A to key B—may be aver- 
aged. Rosenberger (1970) has described a method for 
treating data from discrete-trial procedures. 
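
The reversal-point definition can be illustrated with a short sketch. It is offered only as one plausible reading of the procedure: the function names, the trial-by-trial record format, and the choice to count switches in both directions and to take the stimulus value present when the switch occurred are all assumptions made for illustration, not details drawn from the studies cited.

```python
from statistics import mean


def reversal_points(levels, responses):
    """Return the stimulus values at which the subject changed keys.

    levels    -- stimulus value on each trial (hypothetical units)
    responses -- "A" or "B" on each trial, in the same order
    Assumption: a reversal is any change of key, in either direction,
    and is scored at the stimulus value present when the change occurred.
    """
    points = []
    for i in range(1, len(responses)):
        if responses[i] != responses[i - 1]:
            points.append(levels[i])
    return points


def tracking_threshold(levels, responses):
    """Estimate threshold as the mean of the reversal points."""
    points = reversal_points(levels, responses)
    if not points:
        raise ValueError("no reversals in this record")
    return mean(points)


# Hypothetical record: the signal falls while key A is pecked and
# rises again after the switch to key B.
levels = [10, 8, 6, 4, 6, 8, 6]
responses = ["A", "A", "A", "B", "B", "A", "A"]
estimate = tracking_threshold(levels, responses)
```

Averaging over many reversals, rather than a few, reduces the influence of any single switch on the estimate.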


Some Methodological Problems 


Sensory data are most informative when variations 
in the response measure depend solely on changes 
in stimulus conditions or sensory states. Though a 
certain amount of variability from other sources is 
inevitable, the contribution of nonstimulus factors 
may be reduced considerably by careful experimental 
design. The sections below outline some stimulus con- 
trol problems and methods that have been successful 
in coping with them. We shall consider these prob- 
lems in the relatively well-defined context of threshold 
research, but it will be evident that most of the dis- 
cussion is equally applicable to the suprathreshold 
and scaling research to be covered later. In some in- 
stances, we shall return to these matters in the later 
discussion. 


FIXATION AND ATTENTION 


The achievement of good control by an accurately 
specified stimulus requires that a subject be appropri- 
ately oriented and attentive. In most human psycho- 
physics the subject’s physical orientation is adjusted 
with the utmost care, but with animals such control is 
often quite crude. Nonetheless, researchers have 
worked out methods for achieving at least some con- 
trol over this variable. An obvious method is physical 
restraint (as in the monkey restraining chair), but we 
stress behavioral methods here. 

When a single manipulandum is used, an animal 
subject’s orientation may remain roughly constant. In 
an olfactory study with rats, Goff (1961) required that 
the rats press the response lever only with their left 
paws; this condition helped to maintain the rat in a 
relatively constant position with respect to a device 
that delivered the odorant. More explicit control of 
orientation may be achieved through the use of an 
observing response, upon which stimulus presentation 
is contingent. In her study of pigeon acuity, for ex- 
ample, P. Blough (1971) used an observing response 
to position the birds at an appropriate distance from 
the stimulus targets. Other researchers have required 
their subjects to insert their heads through holes in 
order to view a visual stimulus, and, in some auditory 
work, orientation has been controlled by requiring 
the animal to press an appropriately placed lever to 
turn on the tone. As we have seen, in his audiometric 
work, Dalland (1970) required his bats to break a 
photocell beam to produce the tone stimulus (see 
Figure 3). Because tone intensity varied with distance 
from its source, the control of the subject’s position 
during stimulus presentation was crucial in this case. 

The training of visual fixation was an important 
feature of studies by Scott and Milligan (1970) on the 
motion aftereffect in monkeys. This effect requires a 
preexposure condition in which the subject fixates on 
the center of a rotating target for a few seconds. Hu- 
mans perform this task readily with the aid of verbal 
instructions. With monkeys, however, lengthy train- 
ing was required. Using infrared light reflected from 
the subject’s cornea, an observer monitored eye posi- 
tion and gradually shaped the required fixation. Al- 
though this method apparently succeeded, the con- 
stant presence of a human observer makes it inefficient 
for many types of research. Automated techniques 
that precisely control eye position will be a boon to 
visual psychophysics with animals. 

Although various fixating and orienting responses 
can be trained, good stimulus control requires, by 
definition, an attentive subject. Thus in psychophys- 
ical studies, a subject’s failure to respond to a signal 
may result from a failure of attention as well as fail- 
ure of the stimulus to exceed threshold. One pro- 
cedure that appears to favor attention is the program- 
ming of aversive consequences for incorrect responses. 
Such a consequence is the time-out, a period during 
which stimuli do not occur and responses are ineffec- 
tive. Time-outs are usually programmed to follow 
false positive responses, but in an effort to achieve 
symmetrical consequences, experimenters may also 
program time-outs to follow failures to report the 
positive stimulus. Occasionally a stronger punishment 
appears to be helpful. Gourevitch (1970) sometimes 
combined shock with time-out in order to improve 
stimulus control in his auditory experiments with 
monkeys. Shock should be used with caution, how- 
ever, because it can cause serious response disruption. 
Still another method that may make the subject “stop 
and look” is the elimination of reinforcement for any 
response that follows stimulus onset with very short 
latency. In two-key designs, a changeover delay op- 
erates in a similar fashion to discourage rapid alterna- 
tion between keys. In this procedure, reinforcement is 
withheld if the time between operation of the keys is 
less than some minimum. Many experimenters also 
cause responses to require a certain amount of effort, 
as in the substitution of a short fixed ratio for a single 
“yes” or “no” response. Observing responses, when 
made to the critical stimulus, may be programmed to 
require multiple responses, presumably resulting in 
an increased exposure to the appearance of the stim- 
ulus. 
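
The changeover-delay contingency described above amounts to a simple timing rule, sketched here for concreteness. The function name, the argument layout, and the one-second default are hypothetical; actual delay values differ from study to study.

```python
def reinforcement_allowed(prev_key, prev_time, key, time, cod=1.0):
    """Decide whether a response is eligible for reinforcement.

    prev_key, prev_time -- key and time (sec) of the preceding response,
                           or None for the first response of a session
    key, time           -- key and time (sec) of the current response
    cod                 -- changeover delay in seconds (hypothetical value)
    A response on a new key is ineligible if it follows the previous
    key operation by less than the changeover delay.
    """
    if prev_key is None or key == prev_key:
        return True  # no changeover occurred
    return (time - prev_time) >= cod


# A switch after 0.5 sec is not eligible; a switch after 1.5 sec is.
fast_switch = reinforcement_allowed("A", 0.0, "B", 0.5)
slow_switch = reinforcement_allowed("A", 0.0, "B", 1.5)
```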

Despite the effectiveness of such techniques, most 
animal subjects are not “perfectly attentive”; that is, 
on some trials they make errors even on easy discrimi- 
nations. Psychophysical functions reflect such errors 
by approaching an asymptote below 100% correct at 
the strongest signal values. Criteria of stable perfor- 
mance, measured during warm-up trials or “catch” 
trials, help the experimenter to estimate the degree of 
such inattention and, if necessary, take measures to 
reduce it. In addition, the experimenter may use a 
correction for attention, such as that suggested by 
Heinemann, Avin, Sullivan, and Chase (1969). This 
correction, based on the upper and lower asymptotes 
of a sigmoidal psychophysical function, assumes that 
the subject is inattentive on a certain proportion of 
the trials and that the probability of being inattentive 
does not vary with signal strength. The second part of 
this assumption is somewhat questionable as a general 
rule, since there is evidence that performance on easy 
discriminations tends to deteriorate after a series of 
trials on difficult discriminations. Perhaps behavior 
associated with attentiveness extinguishes during diffi- 
cult discriminations. 
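
The arithmetic implied by these two assumptions can be sketched briefly. What follows is a reconstruction from the assumptions as stated here, not necessarily the exact formula published by Heinemann et al.; the function name and the sample values are hypothetical. If the subject is inattentive on a fixed proportion of trials and responds on those trials without regard to the signal, the observed function is a linearly compressed version of the attentive one, and rescaling between the two asymptotes undoes the compression.

```python
def correct_for_inattention(p_observed, lower, upper):
    """Rescale observed response probabilities between the asymptotes.

    p_observed -- observed probabilities of the "yes" response, one per
                  signal strength
    lower      -- lower asymptote of the observed function
    upper      -- upper asymptote of the observed function
    Assumption: inattention is equally likely at every signal strength,
    so the observed function is (upper - lower) * p_true + lower.
    """
    if not lower < upper:
        raise ValueError("lower asymptote must be below upper asymptote")
    span = upper - lower
    return [(p - lower) / span for p in p_observed]


# Hypothetical function running from 0.10 to 0.90 rather than 0 to 1.
corrected = correct_for_inattention([0.10, 0.50, 0.90], 0.10, 0.90)
```

If inattention in fact varies with signal strength, as the evidence just mentioned suggests it may, a single rescaling of this kind will misstate the shape of the function.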


EXTRANEOUS CUES 


Animal subjects have an annoying ability to devise 
response strategies different from those intended by 
the experimenter. These strategies are sometimes based 
on extraneous cues provided by poorly designed ap- 
paratus. Stimulus systems must be free of transients 
such as sounds accompanying the onset and offset of 
the stimulus; auditory apparatus must control for the 
effects of standing waves; equipment must be designed 
so that easily confounded dimensions, such as visual 
luminance and wavelength, may be varied indepen- 
dently. The rise time of stimuli can act as a discrimin- 
able cue, so the onset and offset characteristics of the 
stimulus event should remain constant across stimulus 
values. Improved equipment has been an important 
factor in the growing quality of animal psychophys- 
ical studies, but there are occasions when even the 
most careful experimenter cannot be sure that he has 
perfect control over an extraneous variable. For ex- 
ample, in studies involving stimulus wavelength, even 
small luminance differences between monochromatic 
lights may confound the data. In such cases, the con- 
founding factor may be varied at random over a small 
range in such a way that no particular value will be 
well correlated with reinforcement. 

Stimulus preferences may also confound psycho- 
physical data. For example, pigeons tend to respond 
more to some colors than to others, although the pre- 
ferred color appears to differ among birds. In the 
auditory modality, the intensity dynamism effect may 
be considered a type of stimulus preference. Sadow- 
sky (1966) has shown, for example, that rats discrimi- 
nate more accurately when the positive stimulus is the 
more intense of two sounds. This effect appears to be 
consistent enough to require controls. Thus in their 
study of the auditory intensity difference limen, Ras- 
lear et al. (1975) made silence the positive stimulus so 
that dynamism effects could not account for the dis- 
crimination in this go/no-go task. An experimenter 
may cope with less predictable preferences by using 
a reasonable number of subjects, a high training cri- 
terion, and insightful data analysis. 

Response bias is one of the worst plagues of the 
animal psychophysicist. In multiresponse designs, for 
example, position preferences are almost always a 
problem. These may be minimized by a correction 
procedure, such that incorrect responses are followed 
by the same stimulus or stimulus array until a correct 
response occurs. Analysis must omit these correction 
trials, of course, since they are nonrandom. The cor- 
rection procedure may reduce but it does not elimi- 
nate effects of position preference, which appear to 
grow more pronounced as signal strength decreases 
(P. Blough, 1971; Terman, 1970). In a forced-choice 
design, the effects of such preferences are relatively 
benign, since the correct stimulus varies in position. 
In a yes-no task, however, preferences may affect the 
threshold by driving it up (if the preference is for the 
“no” response) or down (if the preference is for 
the “yes” response). Similarly, biases that favor re- 
sponding or not responding may affect thresholds 
determined by a go/no-go procedure. 

Sequential dependencies may also serve as un- 
wanted sources of control. A strictly random determi- 
nation of stimulus order is usually undesirable, since 
this procedure may generate a long series of identical 
trials, during which a position preference may become 
strengthened. Unfortunately, however, modified ran- 
dom series may include contingencies that some an- 
imals discriminate. Thus Terman (1970) reported that 
rats apparently discriminated the constraint that a 
given stimulus array could not occur more than three 
times in succession. When performance in a psycho- 
physical task fails to reach chance level even at very 
low signal strengths, the experimenter may suspect 
(as Terman did) the operation of sequential or other 
nonstimulus cues. 
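
A modified random series of the kind just described can be sketched as follows. The function names, the stimulus labels, and the seed are hypothetical, and the run limit of three merely echoes the constraint Terman reported. The sketch also makes the hazard plain: after a run of the maximum length with only two stimuli, the next trial is fully determined, and it is exactly this regularity that a subject may come to discriminate.

```python
import random


def constrained_sequence(n_trials, stimuli=("left", "right"), max_run=3, seed=None):
    """Random stimulus order with no more than max_run identical trials in a row.

    Requires at least two distinct stimuli.
    """
    rng = random.Random(seed)
    seq = []
    for _ in range(n_trials):
        choices = list(stimuli)
        # If the last max_run trials were identical, exclude that stimulus.
        if len(seq) >= max_run and len(set(seq[-max_run:])) == 1:
            choices.remove(seq[-1])
        seq.append(rng.choice(choices))
    return seq


def longest_run(seq):
    """Length of the longest run of identical consecutive items."""
    best = run = 0
    prev = object()  # sentinel that equals no stimulus label
    for item in seq:
        run = run + 1 if item == prev else 1
        best = max(best, run)
        prev = item
    return best


series = constrained_sequence(200, seed=1)
```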


REINFORCEMENT CONTINGENCIES 


We have discussed previously the role of special 
contingencies in maximizing attention and the role 
that reinforcement may inadvertently play as an ex- 
traneous discriminative stimulus. Other uses of rein- 
forcement may also involve difficulties. For example, 
most threshold procedures program reinforcement for 
correct detections of the signal, regardless of its 
strength. This procedure avoids selective reinforce- 
ment of particular signal strengths and would seem 
to aid the subject in discriminating weak stimuli. 
However, when the stimulus is present but sub- 
threshold,¹ the procedure may also provide reinforce- 
ment for responses that are, functionally, false alarms. 
Little research has been directed specifically at this 
problem, though Nevin (1970) found that increased 
relative reinforcement at weak signals produced an 
increased false alarm rate. Several studies indirectly 
suggest that reinforcement of positive responses to 
subthreshold signals may not seriously affect threshold 
data. The tracking procedure used by Stebbins (1970), 
for example, included a large proportion of weak 
signals and must have resulted in occasionally rein- 
forced “yes” responses to sounds below threshold. Yet, 
as we have noted, these audibility functions were very 
much like those from methods that included a greater 
proportion of more intense tones. In connection with 
the conditioned suppression method, Ray (1970) dis- 
cussed the effect of “unsignaled” shocks that pre- 
sumably occurred following subthreshold sounds and 
reported that, when kept to a minimum, these did not 
seem to affect the data. Blough’s version of the track- 
ing method (1958) seemingly avoided the reinforce- 
ment of perceptually inappropriate responses, yet we 
have seen that this method may introduce other sorts 
of bias. Perhaps the advantages of reinforcing correct 
responses to weak but suprathreshold signals out- 
weigh the disadvantages of reinforcing “false alarms” 
to subthreshold stimuli. 

¹ We recognize that the “threshold” concept is questioned in 
modern psychophysics. In this discussion, however, we find it 
convenient to speak of “subthreshold” stimuli rather than de- 
veloping a complex argument that assumes a continuum of 
stimulus effects to which response criteria are applied. The 
practical impact of the present argument remains essentially 
unchanged after transposition to the signal detection format. 


LEARNING EFFECTS 


Performance on some psychophysical tasks may 
continue to improve for an exasperatingly long time. 
In the acuity work previously mentioned (P. Blough, 
1971), successive psychophysical functions increased 
in steepness for many months. Mishkin and Weis- 
krantz (1959) also reported learning effects that lasted 
for a considerable time. The extent of such effects no 
doubt relates to the difficulty of the task. Experimen- 
ters should be alert to this problem and, when run- 
ning a sequence of conditions, either counterbalance 
for order or first ascertain that their functions repre- 
sent a stable asymptote. 


ABSTRACTION OF THE STIMULUS 


Animal subjects appear to differ in their ability to 
form concepts based on relational or abstract proper- 
ties of stimuli. The efficiency of many experiments 
could be improved if, for example, all species could 
transfer readily among specific instances of “match- 
ing” or “oddity,” but only primates, after long train- 
ing, have been clearly seen to do this. Even matching 
or oddity within a specific dimension (color, for ex- 
ample) may be learned as a series of specific problems 
(Cumming & Berryman, 1961). However, Honig (1965) 
was able to establish “wavelength stimulus difference” 
as a controlling relation, apparently independent of 
specific wavelength stimuli. In this study, pigeons 
learned to peck the right one of two keys if both keys 
were illuminated by the same wavelength and to peck 
the left key if the wavelengths differed. The birds 
could not base their discrimination on any absolute 
wavelength difference because each wavelength ap- 
peared equally often on both keys and was paired 
equally often with itself and with a different wave- 
length. The birds also performed correctly with new 
wavelengths. Malott and Malott (1970) report similar 
generalized matching in single-key tests. There is no 
good evidence, however, that this “same-different con- 
cept” can transfer to a new stimulus dimension in 
the pigeon. 

Because of the abstraction problem and others that 
we have outlined, the measurement of sensory thresh- 
olds is apt to be time-consuming and a continual 
challenge to the experimenter’s ingenuity. Nonethe- 
less, as we have seen, the patient application of well- 
tailored operant methods can yield remarkably de- 
tailed data on the sensory capacities of an animal 
subject. The use of such methods in the study of more 
complex perceptual phenomena raises exciting possi- 
bilities, but has produced many fewer data. We con- 
sider such research in the next section. 


SUPRALIMINAL STIMULI 


The following sections will consider research in 
psychophysical scaling and perception. The nature of 
these problems is more complex than those involving 
threshold determination, and successful studies are 
relatively few. Such experiments often involve ab- 
stract or poorly defined stimulus variables, and they 
tend to be modeled after human research that de- 
pends on relatively complicated verbal instructions 
and responses. Translating such problems into para- 
digms suitable for animal experimentation has been 
difficult compared to the task of designing threshold 
experiments. In a threshold study, for example, the 
experimenter may program reinforcement on the 
basis of the physical presence or absence of the signal. 
In suprathreshold experiments, however, there is a 
broad range of stimuli for which only the subject can 
define “correct” and “incorrect” responses. We cannot 
tell a subject in a scaling experiment what sound in- 
tensity is “twice as loud” as another, nor can we tell 
him in a perceptual study at what relative distance 
two sizes should look equal. Such stimulus control 
must carry over from training in which performance 
can be defined and reinforced. In a size-distance ex- 
periment, for example, the subject might be trained 
with equal physical sizes at equal distances. Test 
stimuli, which would include a variety of size and dis- 
tance conditions, might either follow training presen- 
tations or be interspersed with them. 


Perceptual Studies 


What does an animal “see” when subjected to com- 
plex suprathreshold stimulation? As with human sub- 
jects, the question can often be translated, “What 
different sets of stimulus conditions yield the same 
response?” For example: “For what combinations of 
input parameters do two sounds seem equally loud?” 
(equal-loudness functions), “What combinations of 
arrowheads make two lines elicit the same response?” 
(Müller-Lyer illusion), “After exposure to which wave- 
lengths does the subject respond to a neutral stimulus 
as though it were 550 nm?” (colored afterimage). In- 
formation is also gained from the degree to which re- 
sponses are the same in different situations; for exam- 
ple, if a monotonic response measure is used, it may 
be possible to scale the similarity of a set of stimuli. 

It is apparent that perceptual studies, as just 
formulated, are fundamentally studies of generaliza- 
tion or transfer, and operant generalization methods 
have been used increasingly in this area (Malott & 
Malott, 1970; Thomas, 1969). However, the precise 
method to use is not usually obvious. Shall the re- 
sponse measure be relatively built-in, like reaction 
time, or more subject to conditioning, like response 
rate? Will simple training in the presence of one stim- 
ulus suffice, or must training isolate one stimulus as- 
pect or dimension and associate the response measure 
with this alone? We turn now to examples of research 
in which these questions have been answered in vari- 
ous ways. 


REACTION TIME METHODS 


When stimulus intensity increases, the response, 
whether reflex or learned, decreases in latency. This 
relationship has been used to equate stimuli for in- 
tensity, in much the same manner that lights may be 
equated for luminance by finding intensities that 
leave the pupil a constant diameter. Stebbins (1966) 
used the reaction time method to determine equal- 
loudness contours for the monkey, and in the same 
manner Moody (1969) found equal-brightness con- 
tours for the rat. In these experiments, the animals 
learned to release a lever as soon as a sound or light 
occurred. For a set of different frequencies (or wave- 
lengths), those intensities that yielded constant reac- 
tion times were called equally loud or equally bright. 
The use of several criterion reaction times gave con- 
tours at several intensity levels. A set of curves from 
Moody (1970) appears in Figure 4. 
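
The logic of deriving an equal-response contour from latency data can be sketched as follows. This is an illustration under stated assumptions, not the analysis used in the studies cited: the function names, the data layout, and the numerical values are hypothetical, and linear interpolation between tested intensities is simply a convenient choice.

```python
def intensity_at_latency(intensities, latencies, criterion):
    """Interpolate the intensity at which latency equals the criterion.

    Assumes latency decreases monotonically as intensity increases.
    Returns None if the criterion lies outside the measured latencies.
    """
    pairs = sorted(zip(intensities, latencies))
    for (i0, t0), (i1, t1) in zip(pairs, pairs[1:]):
        if t1 <= criterion <= t0:
            if t0 == t1:
                return i0
            # Linear interpolation between the two bracketing points.
            return i0 + (t0 - criterion) * (i1 - i0) / (t0 - t1)
    return None


def equal_response_contour(data, criterion):
    """Map each frequency (or wavelength) to its criterion intensity.

    data -- {frequency: (intensities, latencies)}
    """
    return {
        freq: intensity_at_latency(ints, lats, criterion)
        for freq, (ints, lats) in data.items()
    }


# Hypothetical latency functions (intensity in dB, latency in msec).
data = {
    1000: ([20, 40, 60], [400, 300, 200]),
    4000: ([10, 30, 50], [400, 300, 200]),
}
contour = equal_response_contour(data, criterion=250)
```

Repeating the computation with several criterion latencies yields a family of contours, one per criterion.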

Fig. 4. Equal-brightness functions of the rat determined by the 
latency method. The criterion latency is indicated to the right 
of each curve. (From Moody, 1970.) 

We might look at one of these reaction time ex- 
periments in more detail, noting the steps taken to 
clarify the stimulus-response relationship. Moody 
(1969) trained his rats to enter a viewing tunnel set in 
the wall of an experimental chamber. At the far end 
of the tunnel, lights of various wavelengths could 
appear on a frosted glass screen. The rat encountered 
a response lever about halfway down the tunnel, and 
above this lever was a loudspeaker. The rat learned a 
response chain as follows: in the presence of white 
noise from the loudspeaker, press the lever; when a 
light appears on the glass screen, release the lever. 
Lever releases to the light produced water reinforce- 
ment. Since the light did not appear unless the lever 
was held down for an interval (.2 to 3.0 sec), the rat 
was in a relatively constant position when the light 
came on and was presumably ready to react. Stebbins 
(1966) assured constant conditions from trial to trial 
by restraining his monkeys in a chair and delivering 
reinforcements directly into the monkey’s mouth. 
Variability in reaction time may be further reduced 
by adding a “limited hold” to the reinforcement con- 
tingency, such that if the subject waits too long to re- 
lease its lever, no reinforcement is forthcoming even 
for “correct” responses. Moody discusses this and 
other methodological matters in his review of the 
method (Moody, 1970). 

The reaction time method has the advantage that 
the relationship between the stimulus variable and 
response is built in. The necessary training, designed 
to shorten reaction time and reduce its variability, 
serves only to exhibit this unlearned relationship 
more clearly. The method thus avoids the difficult 
task of teaching an animal an arbitrary stimulus- 
response relationship and maintaining that relation- 
ship under novel test conditions. However, the reac- 
tion time method suffers the disadvantage that the 
unlearned stimulus-response relationship may not pro- 
vide the information that the experimenter seeks. It 
has thus far been applied only to intensitive dimen- 
sions, and on these it has not permitted one to say by 
how much one stimulus is louder or brighter than 
another, but only to indicate stimuli that are equal. 
It may be possible to surpass both of these limita- 
tions; Moody (1970) suggests that modified choice of 
discriminative situations may make reaction times 
applicable to qualitative dimensions. The possibility 
that a metric scale may be based on reaction time 
data is touched on below. 


SIMPLE GENERALIZATION METHODS 


At the present writing, generalization methods are 
the most widely used operant techniques in the study 
of animal perception. Of these, the simplest involves 
merely training an animal to respond in the presence 
of a given stimulus and then systematically varying 
the stimulus as the response extinguishes (Guttman 
& Kalish, 1956). The number of responses emitted to 
each stimulus value during the extinction test allows 
the values to be ranked according to their similarity 
to the training stimulus. 
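
The ranking step can be sketched in a few lines. The function names and the response totals are hypothetical; expressing each total as a proportion of all test responses is one common way of comparing gradients across subjects and is offered here only as an assumption.

```python
def rank_by_similarity(response_counts):
    """Order test stimuli from most to fewest responses in extinction."""
    return sorted(response_counts, key=response_counts.get, reverse=True)


def relative_gradient(response_counts):
    """Express each stimulus total as a proportion of all test responses."""
    total = sum(response_counts.values())
    if total == 0:
        raise ValueError("no responses were recorded")
    return {stim: n / total for stim, n in response_counts.items()}


# Hypothetical totals for wavelengths (nm) after training at 550 nm.
counts = {530: 40, 540: 90, 550: 120, 560: 85, 570: 30}
ranking = rank_by_similarity(counts)
gradient = relative_gradient(counts)
```

The ranking is ordinal only; by itself it does not say by how much one test stimulus is more similar to the training stimulus than another.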

This simple generalization procedure frees the ex- 
perimenter to choose the most convenient response 
and also the most informative set of test stimuli. The 
stimuli need not come from a physically defined “di- 
mension” such as intensity; indeed, the generalization 
test may be designed to identify possible stimulus di- 
mensions. Exemplifying the flexibility of the method 
with regard to response measures and test stimuli is 
a study of taste by Tapper and Halpern (1968). Rats 
were irradiated with gamma rays as they drank a 
chemical solution. Such irradiation makes the animal 
sick, and under suitable conditions a single exposure 
induces an aversion to the taste of the substance as- 
sociated with irradiation. Thus in this case the re- 
sponse was a decrease in drinking. The test stimuli 
were a number of different chemical solutions; aver- 
sion (decreased drinking) generalized to some of these 
but not others. Tapper and Halpern suggest that the 
method may be a way to determine the dimensions 
along which taste quality is represented in the rat. 

The work of Thomas and his colleagues (Thomas, 
1969) on perception of the vertical exemplifies the 
use of generalization tests to clarify complex interac- 
tion of perceptual variables. In one such study, pi- 
geons were trained to peck at a vertical line projected 
on a response key in a lighted box. When the birds 
were exposed successively to a number of lines set at 
different angles, they responded most to the line that 
was “visually vertical” (i.e., at right angles to the ceil- 
ing) even when the box was tilted around the bird 
while the floor remained horizontal. Birds trained in 
a dark box while standing on a horizontal floor, but 
tested in the dark on a floor tilted 24° to the left, 
responded most to a line tilted approximately the 
same amount to the left. The experiments suggest that 
visual cues predominate in the birds’ perception of 
vertical, but when these are absent postural or kines- 
thetic cues may be used. Thomas relates these results 
to similar work on “sensory-tonic” and “field- 
dependency” effects in human subjects. 

The simple generalization test, with single-stimulus 
training followed by extinction, has serious limita- 
tions, however. Since the data are obtained during 
extinction, responses are relatively few and data 
points are variable. The procedure singles out no 
particular aspect of the stimulus, so the aspect to be 
tested may control the measured response weakly or 
not at all. For example, after simple exposure train- 
ing, the wavelength of light on a pigeon’s pecking 
key does control response probability; line tilt also 
does, though less strongly; tone frequency does not, 
without additional discrimination training. We turn 
now to the uses of such additional training. 


GENERALIZATION AFTER 
DIFFERENTIAL REINFORCEMENT 


Perceptual studies commonly incorporate differ- 
ential reinforcement, in order to improve upon the 
relatively weak and nonspecific control often gen- 
erated by single-stimulus training. The simplest way 
to strengthen control is to reinforce responses in the 
presence of the stimulus and omit reinforcement in 
the absence of the stimulus. Jenkins and Harrison 
(1960) found this sufficient to bring the pigeon’s peck- 
ing response under the control of tone frequency. 
When an interrupted tone (“beep-beep-beep . . .”) 
was on continuously throughout a pigeon’s key-pecking 
experience, the rate of pecking did not change when 
the frequency of the tone departed from its training 
value. However, when birds were trained to peck only 
when the tone was on and not when it was off—a dis- 
crimination presumably irrelevant to the frequency 
dimension—subsequent tests revealed strong control 
of response rate by tone frequency. 

In the Jenkins and Harrison experiment, reinforce- 
ment in the presence of a stimulus, contrasted with 
nonreinforcement in its absence, may be said to “call 
the subject’s attention” to the stimulus. However, no 
particular aspect of the stimulus was singled out. Jen- 
kins and Harrison would probably have observed 
response changes to variations, for example, in the 
intensity or harmonic content of their tone stimuli, 
had these been tested. To increase control by a spe- 
cific stimulus aspect, one may employ differential 
reinforcement of stimuli differing in this aspect alone. 
Such differential reinforcement during training may 
be accompanied by equal reinforcement of irrelevant 
stimulus variations. We mentioned an instance of this 
above: the equal reinforcement of randomly varying 
stimulus luminances, in studies where wavelength was 
crucial but luminance was irrelevant. 

Studies of interocular transfer of line tilt illustrate 
these methodological points. They also illustrate the 
uncertainty about the nature of the controlling stimu- 
lus which may remain, even following differential 
reinforcement of a seemingly obvious aspect. Mello 
(1966) studied transfer of a tilt discrimination from 
one eye to the other by fitting pigeons with goggles 
that restricted vision to the anterior visual field of 
one eye. She then reinforced presentations on the 
pigeon’s pecking key of a 45° line (S+) while presenta- 
tions of the line tilted in the opposite direction (135°) 
went unreinforced (S-). During S- periods, pecks also 
delayed the return of the 45° S+. Following such 
training, “angle of tilt” would be expected to control 
pecking more than other aspects of the line, such as 
color, that were the same during both S+ and S- 
periods. 

After this training period, Mello tested her birds 
with a series of line tilts. When tested with their 
trained eye uncovered, the birds responded most (as 
expected) to the 45° line to which they had been rein- 
forced. With the untrained eye uncovered, however, 
they responded most to a tilt of 135°, the mirror 
image of the training pattern. This result could have 
interesting implications for the nature of the pigeon’s 
stimulus processing mechanisms; Mello’s comments 
on this matter concerned the representation in the 
brain of patterns viewed monocularly. 

However, the mirror image data are open to 
conflicting interpretations, and these 
exemplify hidden assumptions that may underlie per- 
ceptual studies. Mello assumed, quite naturally, that 
angle of line tilt controlled behavior in the training 
and test situations. Corballis and Beale (1970) sug- 
gested, however, that the controlling stimulus in this 
situation may best be defined as “up” versus “down,” 
rather than angle of tilt. A bird trained with its left 
eye covered has most of its left field of view occluded: 
if such a bird attended only to the right half of the 
stimulus key, a 45° line would be “stimulus in the 
upper part of the field” and a 135° line would be 
“stimulus in the lower part of the field.” If, when the 
right eye was covered, the left eye similarly attended 
only to the left half of the key, the 135° line would 
now be “upper” and the 45° line “lower.” Thus a 
discrimination between “upper” and “lower,” per- 
formed with either eye, would cause the apparent 
mirror-image reversal reported by Mello. Corballis 
and Beale did tests with other sorts of line stimuli 
that seem to support their contention. We may con- 
clude that the controlling aspects of even seemingly 
simple stimuli cannot be taken for granted; the ex- 
perimenter may have one stimulus aspect in mind, 
and the animal subject another. 

The interocular transfer experiments also suggest 
the importance of possibly unlearned perceptual fac- 
tors in determining the nature of controlling stimuli. 
Another example from recent research concerns “fea- 
tures,” or salient portions of visual patterns, which 
seem crucial to pigeons in determining the course of 
pattern discrimination learning (Sainsbury, 1971). 
This feature-dependent control would surely affect 
research on pattern perception in which birds peck 
at complex visual targets. 

In some cases, such as the Tapper and Halpern 
taste research cited above, the dimensions of complex 
stimuli are admittedly unknown. The object of differ- 
ential training then is not to call the subject’s atten- 
tion to a supposedly known dimension, but rather to 
establish control by known physical stimuli. Some 
experimenters then use generalization tests to identify 
the manner in which the subject classifies or dimen- 
sionalizes the stimuli. A good example of such work 
is a study by Sutherland (1969) on shape perception 
in rats, octopuses, and goldfish. Sutherland taught his 
subjects to discriminate between a square and either 
a horizontally oriented parallelogram or a vertically 
oriented parallelogram. He then presented a variety 
of shapes and recorded the percentage of trials on 
which the subjects gave the “square” response. These 
percentages were used to rank the test shapes with 
respect to their similarity to the square. From his 
inspection of this ranking, Sutherland concluded that 
octopuses seem to discriminate squares from parallelo-
grams on the basis of the “presence or absence of thin
horizontal or vertical segments,” while rats discrimi-
nated on the basis of “oblique contours running in
the same direction as the contours of the original
parallelogram.” (Goldfish seemed somewhere in be-
tween.) Sutherland incorporated these conclusions in
a tentative receptive-field model of shape discrimina- 
tion. 

Generalization following differential training may 
not be simply a matter of recording the rate or prob-
ability of response to a set of test stimuli. Several
manipulanda may be involved, and whole patterns of
discriminative response may transfer to the test situa-
tion. A case in point is the measurement, by Scott and
his co-workers (1963, 1970), of the spiral aftereffect in
monkeys. We discussed previously the measures these
investigators took to ensure the monkey’s attention 
to and fixation upon a preexposure target. The train- 
ing task was a discrimination between an expanding 
bright circle (right-hand lever correct) and a con- 
tracting circle (left-hand lever correct). Every second 
correct press produced a food pellet, while presses of 
the wrong lever yielded a mild shock. Following this 
differential reinforcement, a series of test stimuli 
yielded a psychometric function relating the percent- 
age of left-lever responses to rate and direction of 
change in circle diameter. Preexposure to an “expand- 
ing” or “contracting” spiral shifted this function, as 
shown in Figure 5. Notably, the method provided a 
quantitative measure of the aftereffect; the rate of 
circle expansion or contraction at which the monkey 
judged the circle “stationary” was affected by pre-
exposure in much the same way as in humans. How-
ever, the use of monkey subjects enabled the investi-
gators to carry out a subsequent study of the effects
of certain brain lesions on this perceptual phenome-
non.

Fig. 5. Proportion of left-lever (“contracting”) responses as a
function of rate of change of circle diameter, following pre-
exposure to three spiral conditions: stationary, open circles;
counterclockwise rotation (“contracting”), filled circles; clock-
wise rotation (“expanding”), triangles. (From Scott & Powell,

The spiral aftereffect experiments provide a good 
example of the problems caused by the experimenter’s 
inability to define the “correct” test responses and
hence his inability to maintain reinforcement during
testing. To minimize the difference in reinforcement
conditions between training and testing, Scott and
Milligan (1970) sometimes reinforced correct re-
sponses during testing on trials when the circle was
expanding or contracting so rapidly that its motion 
would override any aftereffect. One monkey appar- 
ently detected this contingency: “He seemed to dis- 
cover that when the spiral was rotating, difficult dis- 
criminations never resulted in shock, and the animal 
responded with a stereotyped left lever response to all 
circles moving at less than 1.7 minarcs per second” 
(p. 354). The maintained generalization procedure, 
next to be described, accepts the development of dis- 
crimination in the testing situation as the price for a 
more copious flow of test data. 


MAINTAINED GENERALIZATION 
PROCEDURES 


A generalization test in extinction is often ade-
quate to suggest the major characteristics of generali-
zation gradients, especially if group averages are used.
However, for meaningful individual gradients show-
ing some detail, it is necessary to increase overall
responding by incorporating reinforcement into the
test procedure. The simplest way to do this is to inter-
sperse some reinforced training stimulus presentations
among randomized test stimulus presentations. Usu- 
ally, the reinforced trials can occur rather infre- 
quently and still maintain responding. 

Recently, P. Blough (1972) used the maintained 
procedure to investigate a number of regions on the 
visual wavelength continuum. Her data showed 
marked differences among spectral regions in the 
shape of the generalization gradients; the steepness of
the gradients was reasonably consistent with
wavelength discriminability data (cf. Wright, 1972, 
discussed above). A set of these gradients appears in 
Figure 6. The figure also shows an important feature 
of the maintained procedure: the gradual sharpening 
of gradients around the reinforced stimulus. As an 
experiment continues over many hours, gradient 
steepness gradually approaches an asymptote that de- 
pends on the discriminability among the test stimuli. 
The maintained procedure thus represents a transition 
from “pure generalization” data, where the subject’s 
response is relatively unconstrained by differential 
reinforcement, to “discriminability” data, where re-
sponse is maximally constrained and limits are set by
factors such as sensory acuity. 

Because the maintained generalization gradient 
provides a continuous stream of quantitative data, it 
is useful for scaling and signal detection work, which
requires a substantial data base. The procedure is
advantageous also for the investigation of discrimina-
tive processes such as stimulus summation and atten-
tion, which are considered elsewhere in this volume
(Chapter 16; cf. D. Blough, 1969, 1972). A serious dis-
advantage of the maintained procedure is that re-
sponse to stimuli highly discriminable from S+ may
fall rapidly to an uninformative zero level. Thus the
procedure is most useful with sets of quite similar
stimuli.

Fig. 6. Maintained wavelength generalization gradients around
reinforced stimuli in six spectral regions. Responses to the value
indicated by the arrow in each panel were intermittently rein-
forced. Other stimuli were presented in extinction. Data are
averaged across three birds at two different stages of training.
The points connected by solid lines are spaced at 4-nm steps
and were obtained somewhat earlier than those connected by
dashed lines. (From P. Blough, 1972.)


Scaling 


With one or two exceptions, meaningful scales of 
stimulus continua produced by animal subjects are 
still more a hope than a reality. We consider scaling 
here because of promising results from a few experi- 
ments, interesting methodological beginnings, and the 
potential usefulness of such data. Adequate scales may 
simplify our conception of the processes of behavioral 
control by aiding us to distinguish influences at- 
tributable to sensory or perceptual systems from those 
attributable to other variables. For example, we are 
in a better position to study discrimination learning 
if we can present our subject with sets of stimuli that 
are equally different, since the effects of other vari- 
ables may then be isolated with more success (for 
such a use of scaling see D. Blough, 1972). Scales also 
tell us about the functioning of sensory processes 
themselves. For example, sensory transduction must 
be consistent with a power law of sensory magnitude, 
if that law is correct; similarly, color coding mech-
anisms must be consistent with the “color circle” such
as that determined by Schneider for the pigeon 
(Schneider, 1972; see below). 

Luce (1972) identifies three kinds of psychophysical 
measurement that we may use as a guide to animal 
work. First, and currently most promising in animal 
work, are studies that derive scales from ordinal data. 
In this kind of scaling, the response measure (for ex-
ample, rate or probability) is used only to provide in-
formation about order; that is, for example, if stimu-
lus A elicits 30 responses, B elicits 15 responses, and C
elicits 14 responses, the method uses only the informa- 
tion that responses to A > responses to B > responses 
to C. Many quite different data sets (such as 30, 27, 1) 
would of course yield the same order. The method has 
the great advantage that nothing is assumed about 
the response measure except that it is monotonically 
related to the similarity or psychological distance be- 
tween the corresponding stimuli. 
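
The point about order can be made concrete with a small sketch. The Python below is illustrative only: the function name `rank_order` is an assumption introduced here, and the counts are those of the example in the text. It shows that the two data sets are indistinguishable to a method that uses rank order alone.

```python
def rank_order(responses):
    """Return stimulus labels sorted from most to fewest responses."""
    return sorted(responses, key=responses.get, reverse=True)

# Response counts from the example in the text.
observed = {"A": 30, "B": 15, "C": 14}
# A quite different data set that preserves the same order.
alternative = {"A": 30, "B": 27, "C": 1}

order_observed = rank_order(observed)
order_alternative = rank_order(alternative)

# An ordinal method treats the two data sets as equivalent.
same_order = order_observed == order_alternative
print(order_observed, same_order)
```

Only the ordering survives; the sizes of the differences between counts play no part in the analysis.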

Shepard, a major developer of this method (e.g., 
Shepard, 1966), applied his analysis to existing wave-
length generalization data from pigeons (Shepard,
1965). He showed that a transformation of the wave-
length scale would convert these gradients, which
come from overlapping regions of the spectrum, to
approximately the same shape. Some converted gra-
dients appear in Figure 7; note that wavelengths on the
abscissa are irregularly spaced, for units on this axis
now represent psychological distance or similarity.
The ordinate in Figure 7 is response frequency, but
the wavelength scale would be unaffected by any
y-axis transformation that preserved the order among
the points. For example, after a logarithmic trans-
formation of the ordinate, each gradient would still
be the same shape as the others, even though this
common shape would change. Thus such a trans-
formation would not call for any change in the stim-
ulus scale.

Fig. 7. Generalization gradients based on data from Guttman
and Kalish (1956) (solid lines) and from Blough (1961) (broken
lines). Note the uneven spacing of the wavelengths; their relative
distances have been adjusted to increase the uniformity of the
gradients. (From Shepard, 1963.)

It is helpful to think of distance along the abscissa 
in Figure 7 as representing similarity, to the pigeon,
among the various wavelengths. However, this par-
ticular scale does not actually represent similarity
among hues very accurately. For one thing, stimulus 
luminance was not controlled in the experiments 
upon which the curves are based. More importantly, a 
single dimension is inadequate to represent the sim- 
ilarity among hues judged by human subjects, and 
evidently this is also true of pigeons. A more adequate 
two-dimensional map of hue for the pigeon has been 
developed by Schneider (1972), in the most extensive 
scaling job to be found in the animal literature to 
date. Schneider based this map on wavelength differ- 
ence ratings from pigeons, using a method much like 
that employed by Honig (1965) to study the wave- 
length difference dimension. In Schneider’s study, two 
wavelengths appeared as a split field on the center key 
of a three-key pigeon box. An observing peck at the
split field lighted the side keys. If the two wavelengths
were the same, a peck at the right key was correct, but
if they were different, a peck at the left key was
correct. Correct responses were always signaled by a
feeder light flash, with food coming on 20% of the
correct trials. The measure of dissimilarity of two 
stimuli was the percentage of left key pecks given 
when these stimuli appeared together. 

Schneider used 12 wavelengths (66 pairs) spaced 
across the spectrum in one experiment and 15 wave- 
lengths (105 pairs) in a second. The percentages of 
left-key pecks to all the pairs were ranked, the ranks 
averaged across birds, and the averages used to derive 
a two-dimensional spatial representation of the dis- 
tance between the various stimuli. This representation 
appears in Figure 8. An appreciation of the relation
between the raw data and this figure may be aided if
one imagines straight lines drawn across the diagram
to connect various pairs of wavelengths. A long line
signifies that the pair elicited many “different” pecks;
if the lengths of these lines were ordered from greatest
to least, their order would (ideally) match the order of
the ranked key-peck data. A provocative feature of
this method is the extent to which mere rank-order
information can determine the distance between
points in a diagram such as Figure 8. The form of
Schneider's function is also very interesting, for
human ratings of color similarity also yield a roughly
circular configuration, with long and short wave-
lengths perceptually similar. The diagram suggests
that, despite important anatomical differences, the
color mechanisms of the pigeon and man may be
similar in some ways.

Fig. 8. A two-dimensional color space for the pigeon, based on
the rank-ordering of wavelength pairs by “same-different” pecks.
See text. (From Schneider, 1972.)
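
The relation between rank-ordered data and distances on a map can be sketched in a few lines of Python. This is not Schneider's scaling procedure; it only checks how well a given configuration reproduces the order of the data. The names `pair_count` and `order_agreement`, the four-point map, and the peck percentages are all hypothetical values introduced for illustration.

```python
from itertools import combinations
from math import dist

def pair_count(n):
    """Number of unordered pairs among n stimuli."""
    return n * (n - 1) // 2

def order_agreement(points, dissimilarity):
    """Fraction of comparisons between pairs whose map-distance order matches the data order."""
    pairs = list(combinations(sorted(points), 2))
    agree = total = 0
    for p, q in combinations(pairs, 2):
        d_p = dist(points[p[0]], points[p[1]])
        d_q = dist(points[q[0]], points[q[1]])
        s_p, s_q = dissimilarity[p], dissimilarity[q]
        if s_p == s_q or d_p == d_q:
            continue  # ties carry no order information
        total += 1
        agree += (d_p > d_q) == (s_p > s_q)
    return agree / total

# Hypothetical four-stimulus map; coordinates are invented for illustration.
points = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (1.0, 1.0), "d": (0.0, 1.5)}
# Hypothetical percentages of "different" pecks for each pair.
dissimilarity = {
    ("a", "b"): 40, ("a", "c"): 70, ("a", "d"): 75,
    ("b", "c"): 35, ("b", "d"): 90, ("c", "d"): 50,
}

print(pair_count(12), pair_count(15))          # pair counts for 12 and 15 stimuli
print(order_agreement(points, dissimilarity))  # 1.0 means the map reproduces the data order
```

A scaling program searches for the configuration that brings this kind of agreement as close to perfect as possible; the sketch only evaluates one candidate configuration.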

Direct scaling methods use more information from 
the response measure than simple rank order. In 
human psychophysics, magnitude estimation exempli- 
fies these methods. In animal work, attempts have 
been made to use response rate for the same purpose. 
For example, Herrnstein and Van Sommers (1962) 
used rate in an attempt to scale visual stimulus inten- 
sity to the pigeon. In this study, pecking rate corre- 
sponded roughly to a human subject’s assignment of 
numbers to stimulus intensities in the magnitude 
estimation procedure. To bring rate under the control 
of intensity, these investigators first trained birds to 
respond at different rates to several different visual 
intensities with increased rates required for higher 
intensities. New stimuli, in between the training 
stimuli in intensity, were found to elicit intermediate 
rates. These rates, taken as estimates of the magnitude
of the intermediate stimuli, were roughly predicted by 
a power function relating intensity and magnitude. 

As Boakes (1969a) points out, the Herrnstein and 
Van Sommers method for direct sensory scaling seems 
to depend on the establishment of a general relation 
(“the brighter the light, the faster I go”) rather than
on the attachment of specific rates to specific inten- 
sities. In other words, for the method to be effective, 
the response measure must change continuously and 
monotonically with the stimulus dimension under 
study. Furthermore, the function representing this 
change must not be tied to specific parameters of 
training, such as the particular response rates rein-
forced. It is not clear that either of these conditions is
met in the Herrnstein and Van Sommers work; the 
results might, for example, represent the generaliza- 
tion to the test stimuli of the specific rates controlled 
by nearby training stimuli. In light of these objec- 
tions, Boakes (1969a) recommends the indirect method 
of bisection, which makes fewer assumptions, and he 
has used this method with some success in the bisec- 
tion, by pigeons, of a brightness interval (1969b). In 
this case, pigeons learned to peck right for a bright 
stimulus and to peck left if the stimulus was dim. ‘The 
stimulus that produced equal pecking on each key was 
then assumed to bisect the brightness interval. 
Boakes’s data suggest that a power function describes 
the relation between visual intensity and subjective 
magnitude in his pigeons. However, the prevalence of 
position preferences, which we discussed earlier, could 
seriously interfere with this method. Since these pref- 
erences tend to be most pronounced during difficult 
discriminations, they might be expected to affect re-
sponse choice for intermediate stimuli. This effect may
account for individual differences in the data from
some of Boakes’s birds. 
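
The link between a bisection point and a power function can be written out explicitly. The following is a sketch under assumptions not stated in this form in the text: that subjective magnitude follows a power function with exponent n, and that the bisecting stimulus lies midway between the two training stimuli in subjective magnitude. The symbols k, n, I_1, I_2, and I_b are introduced here for illustration.

```latex
\psi(I) = k I^{n}, \qquad
\psi(I_b) = \tfrac{1}{2}\bigl[\psi(I_1) + \psi(I_2)\bigr]
\;\Longrightarrow\;
I_b = \left(\frac{I_1^{\,n} + I_2^{\,n}}{2}\right)^{1/n}.
```

Under these assumptions an exponent of 1 places the bisection point at the arithmetic mean of the two intensities, and as n approaches zero the point approaches their geometric mean, so the observed bisection point constrains the exponent.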

Because operant response probability is easily con- 
ditionable and is subject to so many nonsensory in- 
fluences, it seems unlikely that it can be used con- 
veniently as a direct scaling measure. The relation 
between reaction time and intensity, on the other 
hand, seems largely built into the organism, and 
this measure may provide information on the trans- 
duction of intensity. As we have already seen, equal 
loudness and equal brightness contours have been 
successfully based on equal reaction times (Moody, 
1969, 1970; Stebbins, 1966). Stebbins suggested that 
loudness for the monkey might be scaled in units of 
reciprocal response latency. Recent work with human 
subjects (Mansfield, 1970; Vaughan, Costa, & Gilden, 
1966) suggests that human reaction time is a power 
function of intensity, just as are numerical magnitude 
estimates. Aikin (1973) compared reaction time scales 
with magnitude estimation scales for the same sets of 
auditory and visual stimuli and found them very 
similar. Perhaps reaction time scales may serve the
function in animal psychophysics that magnitude esti- 
mation scales serve in human psychophysics, but much 
remains to be done before this correspondence can be 
accepted. 

Biases introduced by training conditions and rein-
forcement contingencies clearly hamper direct scaling
in animal studies. Probabilistic models of information
processing, such as signal detection, hold out the hope 
that the effects of bias may be extracted from psycho-
physical data, and “pure” sensory scales might result.
The d’ value of detection theory, for example, could 
serve as the unit for a sensory scale which, if the 
assumptions of the theory were met, would be inde- 
pendent of reinforcement and other nonsensory biases. 
Though such scaling has not been attempted with 
animals, signal detection theory has played an in- 
creasingly important role in other areas of animal 
psychophysics, and we shall consider this matter in 
the following sections. 


SIGNAL DETECTION THEORY IN 
ANIMAL PSYCHOPHYSICS 


Experimenters who study sensory processes usually 
seek data that reflect stimulus effects unconfounded by 
variables that affect behavior through other channels, 
such as motivation, attention, or response bias. We 
have already discussed some procedural measures de-
signed to minimize the impact of these other vari-
ables. Despite such techniques, however, confounding
effects remain, and efforts to isolate them through
appropriate data analysis continue to be part of psy-
chophysical research. The theory of signal detection is
a relatively recent mode of analysis that attempts to 
separate sensory from other influences, which are 
lumped together as “bias.” 

The theory of signal detection became popular 
largely because it seemed to account more adequately 
than did classical theory for the pattern of detection 
responses to weak stimuli. In the present operant con- 
text, we can express a major difference between signal 
detection theory and classical theory in terms of their 
assumptions about the nature of stimulus control. 
Classical theory says, in effect, that detection responses 
are entirely under stimulus control on most trials. A 
weak stimulus sometimes is below threshold, leading 
to a “no” response. Classical theory further assumes 
that on some percentage of the below-threshold trials 
(which experimenters try to minimize) unspecified 
nonstimulus variables cause the subject to emit a
“yes.” These “false alarms” are pure guesses, unre- 
lated to the stimulus. Signal detection theory, on the 
other hand, holds that the stimulus controls response 
on all trials, but this control is shared with other 
(bias) variables. The theory incorporates a statistical 
scheme by which stimulus effects are, in effect, added 
to bias effects to determine response outcome. 

The assumptions of detection theory lead to the 
prediction, widely confirmed in human experiments, 
that changes in signal strength on the one hand, and
changes in bias on the other, affect the probability of 
a detection response in characteristic ways. The out- 
come of these changes is clarified by the sort of graph 
common to signal detection presentations, in which 
correct detections (“yesses” to a signal) are plotted 
against false alarms (“yesses” to non-signals). When a
signal of constant strength is presented many times,
while one or more other factors are varied (e.g., the
relative reinforcement of “yes” and “no” responses),
points along a “receiver operating characteristic”
(ROC) or “isosensitivity” curve result. If signal
strength changes, a new ROC curve is produced.
Examples of such curves are shown in Figure 9. The
most common form of signal detection theory predicts
that these ROC curves will have the form shown in
Figure 9A and that they will become straight lines, as
in Figure 9B, when the coordinates are transformed
into standard deviation units (“z-scores”). These dia-
grams separate graphically the two kinds of variables,
signal and bias, with which the theory deals: signal 
changes move a data point from ROC curve to ROC 
curve, toward or away from the main diagonal; bias 
changes slide the point along a curve. The sensitivity
index d’ measures the stimulus effect: it represents the
distance of a curve from the diagonal. We cannot go
further into the basic theory in this chapter; the
reader unfamiliar with signal detection theory is
urged to consult a reference such as Engen (1971) or
the more comprehensive Green and Swets (1966).
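
The separation of signal and bias effects can be illustrated numerically. The sketch below assumes the equal-variance normal model (noise distributed as N(0, 1), signal as N(d, 1)), which is one common reading of "the most common form" of the theory; the function names and the particular criterion values are assumptions introduced here.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for the z-transform

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index under the equal-variance normal model: z(H) - z(F)."""
    return _STD.inv_cdf(hit_rate) - _STD.inv_cdf(false_alarm_rate)

def roc_point(d, criterion):
    """(false-alarm rate, hit rate) for sensitivity d and a criterion on the decision axis."""
    false_alarm = 1.0 - _STD.cdf(criterion)
    hit = 1.0 - _STD.cdf(criterion - d)
    return false_alarm, hit

# Sliding the criterion (a bias change) moves the point along one ROC curve...
curve = [roc_point(1.5, c) for c in (-0.5, 0.25, 0.75, 1.5)]
# ...but every point on that curve recovers the same d'.
recovered = [round(d_prime(h, f), 6) for f, h in curve]
print(recovered)
```

Because z(H) minus z(F) is constant along the curve, the points fall on a straight line of unit slope in z-score coordinates, which is the parallel-line prediction of Figure 9B.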

The primary variables that affect performance, ac-
cording to detection theory, are signal strength, signal
probability, and reinforcement contingencies. Since
these variables are easily manipulated in animal dis-
crimination experiments, and because such studies
may feasibly involve many thousands of responses, 
animal experimenters have been increasingly inter- 
ested in using this sort of analysis. This has been done 
not only to clarify sensory processes, but also to in- 
vestigate the assumptions of detection theory and to 
find out more about the actual interactions of the 
variables controlling behavior. 

The use of animal subjects has led to some re-
visions in the rationale behind assumptions of detect-
ability theory, if not in their form. For example, the
human observer is usually assumed to be “fully in-
formed” about the distribution of input events that he
will encounter, and he is thought to make a rational
decision leading to such a goal as “maximum number
of correct responses.” An analysis more congenial to
most animal investigators is provided by Boneau and
Cole (1967), who show that, if a subject adjusts its re-
sponses to “most food for least work,” its contact with
the reinforcement contingencies can provide informa-
tion comparable to that available to the “fully in-
formed” human observer. Detection theory also as-
sumes that trials are statistically homogeneous; that is,
essentially the same statistical processes control detec- 
tion on every trial. This assumption is rather rarely 
mentioned in human studies, but must be carefully 
considered in animal work, where occasional “lapses 
of attention” seem almost inevitable. We shall return 
to this matter below. 


Some Animal Studies Using Signal 
Detection Analysis 


A number of recent studies have subjected animal 
data to signal detection analysis. Of these, however, 
few have used the analysis as an aid in determining 
sensory functions. One of the few is the wavelength 
discrimination study by Wright (1972), whose method 
was previously described. In this study, the pigeon 
made a choice peck at one key if two wavelengths 
were the same, and at another key if they differed. In 
order to provide the bias changes that are necessary to 
generate ROC curves, Wright varied the relative prob- 
ability of reinforcement for responses to the two 
choice keys. Families of such curves arose from the use 
of a number of wavelength differences (corresponding 
to different “signal strengths”) at each reference wave-
length.

Wright organized his data by separating correct de-
tections (pecks on the “different” key when the pro-
jected wavelengths differed) from false alarms (pecks
on the “different” key when the wavelengths were the 
same). This pair of values was obtained for each set 
of wavelengths under each reinforcement condition, 
and such a pair provided the coordinates for a single 
point on a ROC diagram. Variation of reinforcement 
probability, with wavelength difference constant, pro- 
vided the set of points for an isosensitivity (ROC) 
function; each wavelength difference provided a new 
function, as exemplified in Figure 4 by the data from 
Moody (1970). In Wright’s data, the curves shift 
toward the upper left corner of the diagram as wave- 
length difference increases. 

When correct detections and false alarms are scaled 
on normalized coordinates, the most common form of
detection theory predicts that the ROC curve for each
stimulus condition will be parallel to the main 
diagonal as shown in Figure 9B. The distance be- 
tween the ROC curve and the diagonal then may be 
summarized by a single number, d’, which represents 
the subject’s sensitivity to the stimulus that yielded 
the line. However, most of Wright’s data, like most 
results from human subjects, are best fit by lines that 
converge, so no single distance index can describe the 
bird’s sensitivity for a given stimulus condition. A 
variety of measures have been used to cope with this 
situation (cf. Green & Swets, 1966). Wright adopted as 
his measure the distance of the ROC curve from the 
major diagonal at the point of “zero bias,” that is, 
where the curve crosses a minor diagonal in a diagram 
such as that of Figure 9. From the d’ values de- 
termined in this way, Wright constructed psycho- 
metric functions relating d’ to wavelength difference 
at each reference wavelength that had been tested. 
From each of these functions he picked off that wave- 
length difference that yielded a d’ of 2.0, and this set 
of wavelength differences yielded detailed discrimin- 
ability functions. 

In a more recent article, Wright (1974) used these 
results to develop a model of discrimination that 
brings signal detection theory together with more 
classic views of the psychometric function. He argues 
that when discrimination is measured under equal-
bias conditions the psychometric function becomes a
straight line if plotted with d’ as a function of stim- 
ulus difference. Further, this straight line passes 
through the origin, so its slope provides the appro- 
priate index of discriminability. Wright argues fur- 
ther that this conceptually simple picture has been 
obscured, in much psychophysical work, by procedures 
that introduce strong response bias. For example, 
human subjects have often been strongly cautioned 
not to give “false alarms” and thus have a strong bias
against “yes” responses. We cannot delve further here 
into Wright’s provocative article, but it clearly illus- 
trates two central points: (1) the great significance of 
response bias and (2) the general utility of using 
animals in psychophysical work. It is most unlikely 
that Wright could have obtained from humans the 
very extensive results upon which his theory is based. 
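
The idea of a psychometric function that is a straight line through the origin, read off at a criterion d’ of 2.0, can be sketched as follows. This is a minimal illustration, not Wright's analysis: the least-squares fit through the origin is a technique chosen here, the function names are assumptions, and the data values are invented.

```python
def slope_through_origin(differences, d_primes):
    """Least-squares slope of d' on stimulus difference, with the line forced through (0, 0)."""
    numerator = sum(x * y for x, y in zip(differences, d_primes))
    denominator = sum(x * x for x in differences)
    return numerator / denominator

def difference_for_criterion(slope, criterion_d=2.0):
    """Stimulus difference at which the fitted line reaches the criterion d'."""
    return criterion_d / slope

# Hypothetical values, invented for illustration (not Wright's data).
wavelength_differences_nm = [1.0, 2.0, 3.0, 4.0]
d_values = [0.5, 1.0, 1.5, 2.0]

slope = slope_through_origin(wavelength_differences_nm, d_values)
threshold_nm = difference_for_criterion(slope)
print(slope, threshold_nm)
```

With a line through the origin, the slope and the difference that yields a fixed d’ carry the same information; a steeper slope means a smaller difference is needed to reach the criterion.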

Studies by Clopton (1972) and Elsmore (1972) 
exemplify the manipulation of bias through changes 
in the probability of signal presentation rather than 
the probability of reinforcement. Clopton’s monkeys 
discriminated increments in noise intensity, and his 
ROC curves are well fit by functions based on Ray- 
leigh distributions rather than on the more usual 
normal distributions of noise and signal plus noise. 
The monkeys in Elsmore’s experiment were trained to
press one lever for a long stimulus duration and
another lever for shorter durations. By changing the
relative probability of the long and short durations,
Elsmore successfully changed the bias toward the two
levers, producing the rather nice data exemplified in
Figure 10. Elsmore also analyzed the interaction be-
tween signal probability and the probability of rein-
forcement on the two levers. This analysis exemplifies
the sort of interactions of variables that may operate
in a detection situation, and we shall illustrate the
main idea by a simple example. Suppose that a long-
duration stimulus (the “signal”) occurs on many trials
and a short-duration stimulus occurs on relatively
few. If the reinforcement schedule for the two stimuli
is the same (as in this case), more reinforcements will
be obtained for pressing the lever indicating “long
duration.” Elsmore showed that the “optimal response
bias” that maximizes reinforcement probability under
these circumstances is very close to the actual bias that
he observed.

Fig. 10. Duration discrimination data from two monkeys. The
animals discriminated 60, 80, or 90 sec from 100 sec. The points
along each line represent different response biases induced by
alterations in the relative probability that the short or long
durations would be presented. (From Elsmore, 1972. © 1972 by
the Society for the Experimental Analysis of Behavior, Inc.)
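
The notion of an optimal response bias can be made concrete with a short calculation. The sketch below is not Elsmore's analysis; it assumes the equal-variance normal model, equal reinforcement for both kinds of correct response, and hypothetical values for sensitivity and signal probability. Under those assumptions, maximizing reinforcement amounts to maximizing the proportion of correct responses.

```python
from math import log
from statistics import NormalDist

_STD = NormalDist()

def proportion_correct(criterion, d, p_signal):
    """Probability of a correct (hence reinforceable) response at a given criterion."""
    hit = 1.0 - _STD.cdf(criterion - d)
    correct_rejection = _STD.cdf(criterion)
    return p_signal * hit + (1.0 - p_signal) * correct_rejection

def optimal_criterion(d, p_signal):
    """Criterion that maximizes proportion correct under the equal-variance normal model."""
    return d / 2.0 + log((1.0 - p_signal) / p_signal) / d

d, p_signal = 1.0, 0.8  # hypothetical sensitivity and signal probability
best = optimal_criterion(d, p_signal)

# Brute-force check over a grid of criteria from -3.00 to 3.00.
grid = [i / 100.0 for i in range(-300, 301)]
grid_best = max(grid, key=lambda c: proportion_correct(c, d, p_signal))
print(best, grid_best)
```

When the signal is the more frequent event, the best criterion falls below the unbiased value of d/2, which corresponds to a bias toward the "signal" lever, in line with the example in the text.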

To obtain ROC curves through experimental 
manipulation of the subject’s criterion, as exemplified 
above, one requires a great many trials at a number 
of values of the biasing variable. In human psycho- 
physics, the “rating method” has been used to produce 
ROC curves more quickly. This method assumes, in 
effect, that a subject has a number of criteria at the 
same time, each controlling a response that indicates 
a degree of certainty about the signal. Thus if a signal 
is not strong enough to make the subject say, “Yes, I 
am very sure a signal was presented,” or even, “Yes, a
signal was presented,” it may still suffice to make him
say, “Yes, a signal might have been presented.” Each 
of these degrees of certainty, or ratings, represents the 
effect of a different bias and generates a point on the 
ROC curve for a given signal strength. D. Blough 
(1967) applied this method to animal subjects, treat- 
ing the pigeon’s rate of pecking as an index reflecting 
its rating of a stimulus. This study used stimuli on a 
wavelength continuum; the stimulus set included a 
single S+ value, presentations of which were interspersed
with many S− presentations. If response rate
was very low, the pigeon was considered to be “quite
sure” that the reinforced stimulus was not present,
while if the rate was high, the pigeon was “sure” that
the reinforced stimulus was present. As in other
methods, a single ROC curve corresponds to one particular
stimulus (“signal”) that differs from the S+
(“noise”). Each point along such a curve is based on
the proportion of trials that yielded a given rate of
response to the S+ (the x-coordinate of the point) or
to the stimulus in question (the y-coordinate of the
point). When plotted on normalized coordinates, these
values produced quite linear ROC curves (Figure 11),
but again the curves converged somewhat, rather than
remaining parallel to the main diagonal. Response
latency has also been used as a rating measure in
animal work (e.g., Yager & Duncan, 1971).
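
The construction of rating-method ROC points described above can be sketched in a few lines of Python. This is only an illustration: the function name `rating_roc_points` and the response counts are hypothetical and are not taken from Blough's study. Following the caption of Figure 11, each point gives the proportion of trials on which i or fewer responses were made, with the S+ on the x-axis and the comparison stimulus on the y-axis.

```python
def rating_roc_points(s_plus_counts, stimulus_counts):
    """Return (x, y) ROC points from per-trial response counts.

    For each criterion i, x is the proportion of S+ trials with i or fewer
    responses, and y is the same proportion for the comparison stimulus.
    """
    max_count = max(max(s_plus_counts), max(stimulus_counts))
    points = []
    for i in range(max_count + 1):
        x = sum(c <= i for c in s_plus_counts) / len(s_plus_counts)
        y = sum(c <= i for c in stimulus_counts) / len(stimulus_counts)
        points.append((x, y))
    return points


# Hypothetical responses per trial (illustrative values, not real data).
s_plus = [8, 9, 7, 10, 6, 9, 8, 10]
other = [2, 5, 3, 7, 1, 4, 6, 8]
roc = rating_roc_points(s_plus, other)
```

Raising i sweeps the criterion from strict to lax, so the points run from the lower left to the upper right corner of the ROC plot without any experimental manipulation of bias.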

Even in the absence of data necessary to generate
complete ROC curves, the signal detection format,
with “hits” plotted against “false alarms,” can be a
convenient way to display response bias and to estimate
its interaction with apparent sensitivity. Terman
(1970), for example, reinforced rats equally on two
levers for “correct detections” and “correct rejections,”
obtaining the “isobias” functions shown in Figure 12.
Only signal intensity varied here, for there was no
manipulation of bias upon which ROC curves could
be based. Nonetheless, it is evident that each of the
three rats, although run under symmetrical reinforcement
conditions, had a relatively constant bias toward
one lever or the other. Rat T16, for example, was
strongly biased toward the right lever, as indicated by
its high rate of false alarms at weak signal values. Despite
such biases, Terman found that the rats gave
very similar functions when the sensitivity index d’
was plotted against stimulus intensity.

Fig. 11. ROC (isosensitivity) functions for a discrimination between
a wavelength of 582 nm (considered “noise”) and each
of several other wavelengths (“signals”). The curves are produced
by the “rating” method, where the coordinates of each
point represent the probability that a given number of responses
(i) or fewer were made to the stimulus in question. When i is
set at successively greater values, a set of points along one curve
results. (From D. Blough, 1967. © 1967 by the American Association
for the Advancement of Science.)

Fig. 12. Isobias functions for discriminations between a 100-db
standard and comparison stimuli from 80 to 100 db. The rats
differ in their degree of bias, but the amount of bias is relatively
consistent in each. ROC curves cannot be plotted here
because bias was not varied for fixed signal strengths. (From
Terman, 1970. © 1970 by the Society for the Experimental
Analysis of Behavior, Inc.)
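
The separation of sensitivity from bias can be illustrated with a minimal Python sketch. It assumes the equal-variance Gaussian version of the detection model; the source does not say how Terman computed d’, and the function names and the two hypothetical subjects below are illustrative only.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hits) - z(false alarms).

    Both rates must lie strictly between 0 and 1, or inv_cdf raises an error.
    """
    return _STD_NORMAL.inv_cdf(hit_rate) - _STD_NORMAL.inv_cdf(false_alarm_rate)


def criterion(hit_rate, false_alarm_rate):
    """Bias index c = -(z(hits) + z(false alarms)) / 2; zero means no bias."""
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    return -0.5 * (z_hit + z_fa)


# Two hypothetical subjects with the same sensitivity but different bias.
unbiased = (_STD_NORMAL.cdf(0.5), _STD_NORMAL.cdf(-0.5))
yes_biased = (_STD_NORMAL.cdf(1.0), _STD_NORMAL.cdf(0.0))
```

Under these assumptions both subjects yield the same d’ even though the second makes many more false alarms, which is the pattern described for the rats above.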

Nevin, one of the first to apply the signal detection 
method to animal data, has suggested that the 
paradigm may clarify the interactions of variables 
other than signal and bias (Nevin, 1970; Nevin, 
Olson, Mandell, & Yarensky, 1975). As we mentioned 
earlier, he reports cases in which d’ appears to vary 
with reinforcement contingencies—despite the com- 
mon assumption that this index is an unbiased index 
of sensory and input factors. Such research suggests 
the question with which we close this chapter: to what 
extent, and in what ways, is signal detection analysis 
of use in animal experimentation? 


The Usefulness of Signal Detection Analysis 
in Animal Work 


Under what circumstances, if any, should a re- 
searcher plan animal experimentation around the use 
of signal detection analysis? This question raises two
more specific issues, which we shall consider in turn. 
First, does the interaction of “sensory” and “bias” 
factors that is implicit in the detection model corre- 
spond to the nature of discriminative control in psy- 
chophysical settings, and does signal detection isolate 
a measure of sensitivity that is truly independent of 
nonsensory variables? Second, is this measure practical 
to use, in terms of the time or the number of observations
required to estimate it?

As we suggested above, a major impetus for detec- 
tion analysis in human psychophysics was a failure of 
psychophysical data to correspond to the classical view 
that on any given trial a detection response is con- 
trolled either by sensory input, or by some error fac- 
tor, but not by both. As our review suggests, animal 
data agree with human results in this respect. Animal 
ROC curves clearly suggest that the detection response 
is a joint function of sensory and other variables, as 
proposed in the detection model. For this reason, the 
detection paradigm may be useful to researchers inter- 
ested in clarifying the nature of stimulus control. This 
sort of research is beyond the scope of this chapter, 
but see, for example, Nevin (1967, 1970), D. Blough 
(1967, 1972), Heinemann et al. (1969), and Chase and 
Heinemann (1972). 

The available literature in this area suggests, how- 
ever, that indices like d’ must be interpreted with 
great care. The data of Heinemann et al. (1969), for 
example, indicate that animals may be inattentive to 
the stimulus on a significant proportion of trials. This 
is equivalent to saying that the classical view of detec- 
tion is partly right, for, on some trials, sensory input 
plays no role in controlling the response. Thus a correction
like the classical correction for false reports
must be estimated and applied to the data, in effect 
removing the “inattentive” trials from the subsequent 
analysis. 

Available animal data, like most results from 
human subjects, fail to follow the simplest version of 
detection theory, which predicts that ROC curves 
plotted on normalized coordinates will be straight 
lines parallel to the major diagonal. Instead, these 
lines converge in such a way that the sensitivity index 
d’ cannot be determined unambiguously. As we saw 
above, Wright chose to estimate d’ from the point at 
which his ROC curves crossed the minor diagonal; this 
method had the advantage of representing detection 
under conditions of “no bias”—that is, when equal
numbers of incorrect responses were occurring on 
both keys. Many other solutions to the convergence 
problem may be found in the detection literature. 
Assumptions about the form of underlying distributions
of sensory events may be altered, or nonparametric
measures, such as the area under the ROC
curve, may be employed (cf. Green & Swets, 1966).
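
As a sketch of one such nonparametric measure, the area under an empirical ROC curve can be approximated by joining the observed points with straight lines. The trapezoidal rule and the function name `roc_area` are assumptions made for illustration; the detection literature offers other estimators.

```python
def roc_area(points):
    """Trapezoidal area under ROC points given as (false_alarm, hit) pairs.

    The corner points (0, 0) and (1, 1) are added so the curve spans the
    whole unit square. An area of 0.5 indicates chance performance.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC points (illustrative values only).
example_area = roc_area([(0.2, 0.6), (0.5, 0.9)])
```

Because the area makes no assumption about the form of the underlying distributions, it is unaffected by the convergence of the normalized ROC lines, although it still requires several points per curve.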

Though various means may be employed within 
the framework of detection theory to deal with the 
convergence of ROC curves, one explanation for the 
phenomenon is rarely considered, for it strikes at a 
fundamental assumption of the theory. This explana- 
tion questions the independence of signal and bias 
variables. In other words, detection theory assumes 
that a given signal will produce a central effect which, 
though it varies statistically from moment to moment, 
is not affected by the variables that control bias. Unfortunately,
the correctness of this assumption has not
been firmly established either for human or animal 
data, and alternatives to it have been suggested (D. 
Blough, 1972; Pike, 1973). 

The work of Nevin and his co-workers (1970, 1975) 
suggests procedural limits on the applicability of the 
detection paradigm. In these studies, the usual rein- 
forcement contingencies produced a good separation 
of sensitivity and bias effects, as implied by the theory.
However, when the usual correlation of reinforcement 
with stimuli was abandoned, this separation broke 
down. Such was the case, for example, when a “yes” 
response was occasionally reinforced in the presence 
of “noise” as well as in the presence of “signal.” In 
such a circumstance, it is hardly surprising that the 
measure of sensitivity suffers since these ambiguous 
reinforcement contingencies lead to an ambiguous 
definition of the “correct” response. Such work is 
valuable in providing a broader context within which 
the area of applicability of signal detection may be 
identified. 

Even if one assumes that d’ is a relatively good
measure of sensitivity under given conditions, one may 
ask if it is sufficiently better than other measures to 
justify its use. The most accurate way to estimate d’, 
and a necessary procedure for some nonparametric 
indices, is to determine experimentally a large por- 
tion of an ROC curve. This takes a great deal of time. 
It seems that, biased though it may be, percentage 
correct remains an adequate measure for most studies 
having to do with sensory function. If the experi- 
ment can be cast into a forced-choice paradigm, in 
which the two responses are symmetrically related to 
the stimulus (e.g., “push the lever under the brighter 
panel’), and response preference is minimized, d’ and 
percentage correct are virtually equivalent. To quote 
Green and Swets (1966): “If the observer’s preference 
among alternatives is indeed negligible, . . . a satisfactory
index of sensitivity is the percentage of correct
response, or the value of d’ that corresponds to this 
percentage” (p. 408). 

Even under less favorable circumstances, such as a
biased yes-no situation, the practical effects of non- 
sensory variables on a threshold estimate may be 
rather small. Irwin and Terman (1970) point out that, 
assuming an animal’s sensitivity (d’) to be constant, 
even a rather strong response bias will yield only a 
relatively small change in percentage correct. For 
example, if a subject incorrectly presses the “no” lever 
five times as often as he incorrectly presses the “yes”
lever, his percentage correct for a given signal strength
will be at most 4% lower than his percentage correct
had he no response bias. The impact of this relatively 
small shift will be even less if bias remains relatively 
constant, for it is the difference in threshold across 
conditions, rather than its absolute value, that is 
usually of interest. 
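
The size of this effect can be checked numerically under stated assumptions. The sketch below assumes an equal-variance Gaussian detection model with signal and noise trials equally likely, and it reads “4%” as four percentage points; these assumptions and all function names are ours, and Irwin and Terman's own derivation may differ in detail.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution


def percent_correct(d, c):
    """Proportion correct for sensitivity d and criterion c."""
    miss = PHI(c - d)            # signal trial falls below the criterion
    false_alarm = 1.0 - PHI(c)   # noise trial exceeds the criterion
    return 1.0 - 0.5 * (miss + false_alarm)


def biased_criterion(d, ratio=5.0):
    """Bisect for the criterion where misses = ratio * false alarms."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if PHI(mid - d) - ratio * (1.0 - PHI(mid)) < 0.0:
            lo = mid  # too few misses: move the criterion upward
        else:
            hi = mid
    return 0.5 * (lo + hi)


def drop_in_percent_correct(d, ratio=5.0):
    """Loss in proportion correct relative to the unbiased criterion d / 2."""
    unbiased = percent_correct(d, d / 2.0)
    biased = percent_correct(d, biased_criterion(d, ratio))
    return unbiased - biased


# Largest loss over a range of sensitivities from d' = 0.1 to d' = 4.0.
worst = max(drop_in_percent_correct(k / 10.0) for k in range(1, 41))
```

Under these assumptions the largest loss stays in the neighborhood of 0.04, in line with the figure quoted above, and it shrinks when sensitivity is either very low or very high.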

It appears, then, that in operant work concerned
strictly with sensory processes, the determination of
ROC curves probably does not add enough information
to be worth its cost. Rather than focusing on the
ROC curve, or on the d’ measure, the experimenter
should concentrate on the minimization of bias, and
use the forced-choice procedure if possible. For work
not strictly sensory, however, the signal detection
paradigm may comprise a convenient framework.
This framework may further the analysis of discriminative
processes in animals, and animal experiments
may provide efficient means for testing the detection
model and its assumptions.


Operant Behavioral Pharmacology


INTRODUCTION 


Scientific interest in the behavioral actions of drugs 
is not new. For decades investigators have studied 
whether certain drugs could alter behavior in open 
field tests, T-mazes, straight alleys, pole climbing 
apparatuses, shuttle boxes, key pecking situations, and 
lever pressing devices. Often such studies attempted to 
elucidate drug effects on such phenomena as memory, 
learning, anxiety, or drives. Two problems became
apparent from this body of research: (1) More sensi- 
tive and reliable behavioral procedures were needed 
to assess the behavioral actions of drugs. (2) Many of 
the early questions asked were clearly premature, if 
not misguided. An objective and operationally based
conceptual framework within which to interpret the 
actions of drugs on behavior had to be formulated be- 
fore such issues could begin to be considered. 

Early infrahuman experiments were given impetus 
in the 1950s when chlorpromazine and reserpine, the 
first tranquilizers, were reported to be useful in treat- 
ing certain psychiatric patients. These findings were 
made in the clinic and came more as a surprise than 
as the logical outcome of planned laboratory research. 
However, due in large part to these early clinical discoveries,
further interest in laboratory research in behavioral
pharmacology took a sharp upswing. In the
late 1950s and early 1960s, the primary interest cen- 
tered around two problems: (1) How can one discover 
new drugs with useful applications in psychiatry? (2) 
Is it possible to arrive at a more thorough understand- 
ing of these drugs through a laboratory analysis? 

It became clear early on that operant techniques 
were among the most sensitive for measuring the be- 
havioral action of drugs (Boren, 1966; Cook & Kel- 
leher, 1963; Dews & Morse, 1961; Gollub & Brady, 
1965; Sidman, 1959). The profusion of studies using
operant techniques that followed led to the emergence
of several journals to accommodate the burgeoning
literature. In addition, textbooks on drugs and behavior
were written (e.g., Iverson & Iverson, 1975;
Rech & Moore, 1971; Thompson & Schuster, 1968)
which involved major emphasis on the experimental 
analysis of operant behavior. 

That operant baselines are sensitive indicators of 
drug action now seems uncontested. However, it is 
one thing to show that a behavioral system is sensitive 
to the manipulation of an independent variable (in 
this case, drug administration), and it is another to
show that the resulting data are meaningful and 
significant. One of the major theses of the present 
chapter is that operant behavioral pharmacology has, 
by and large, succeeded in satisfying the two major 
requisites of a scientific domain concerned with the 
analysis of drug actions on behavior: (1) The pro- 
vision of sensitive and reliable behavioral procedures; 
and (2) the provision of an objective, operationally 
based conceptual framework within which to interpret
the results of experiments on the behavioral
actions of drugs. It is our contention that one of the 
main purposes of operant behavioral pharmacology is 
to interpret the behavioral mechanisms by which these 
behavioral changes are brought about; that is, to ex- 
press the scientific significance of findings concerning 
behavioral actions of drugs in terms of a more general 
set of principles. 

Behavioral pharmacology, which has grown out of 
the integration of experimental psychology and 
pharmacology, is concerned with the behavioral 
actions of drugs. Its primary goal is the description of 
the behavioral mechanisms by which drugs alter be- 
havior. A prerequisite to such a description is an 
understanding of the factors that control the behavior 
in question. To understand the way in which drugs 
alter behavior, it is first necessary to understand the 
factors which control behavior. It follows that a de- 
scription of the interaction of drug variables with be- 
havioral variables is an essential first step in under- 
standing the behavioral mechanisms of drug actions 
(Thompson and Schuster, 1968). In general pharmacology,
the “mechanism of action” of a drug refers to
some “basic” process, typically physiological or biochemical,
which mediates a drug’s effect upon a particular
response. In pharmacology, the term “response”
refers to any change in the organism which
can be reliably produced by a drug on repeated oc- 
casions. Within behavioral pharmacology such re- 
sponses can be properly related only to behavioral 
mechanisms of action—that is, mechanisms describable 
in terms of basic behavioral processes. 

It is important to keep in mind the principle that a
drug cannot cause a biological system to respond in a 
qualitatively new way. That is, a drug may increase or 
decrease values of dependent variables but may not 
cause a fundamental change in the operation of the 
biological system. As a consequence, we must ask our- 
selves, “With which of the existing systems that regu- 
late behavior is a drug interacting to produce the 
observed behavioral change?” This is the fundamental 
question to which we must address ourselves when we 
ask, “What is the behavioral mechanism by which
this drug effect is brought about?”


The Problem of Sample Size 


A common feature of the operant approach to be- 
havioral pharmacology involves the intensive study of 
a single individual subject. The emphasis is on close 
observation and firm experimental control. If the 
experiment is successful, a subject will behave predictably
from session to session and even from minute
to minute. Thus, when an effective drug is adminis- 
tered in the middle of a session, a change from the 
dependable baseline behavior should be readily ap- 
parent in an individual subject. Furthermore, on 
different sessions, a range of drug dosages can be 
studied in the same subject with a sound basis for 
comparison. 

The intensive study of the individual subject will 
be emphasized in this chapter and in other discussions 
of operant techniques. This might be labeled the “N 
of One” approach, and has sometimes been misunderstood.
The approach is not simply to use a small number
of subjects for its own sake. For example, someone
might attempt a drug-behavior experiment with three 
subjects stabilized on a behavioral baseline. The first 
subject is given dose #1, the second subject is given 
dose #2, and the third subject is given dose #3. From 
such an experiment, one can do little more than esti- 
mate crudely the nature of the dose-response curve. 
Differences among subjects will be confounded with 
differences produced by the drug dosage so that one 
cannot tell which is which. A more informative use 
of three subjects would be to administer each dose on 
separate sessions to each of the subjects. Then with 
each individual, one could determine how the differ- 
ent doses modified this subject’s behavior. 
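
The more informative design can be sketched as a simple schedule in which every subject receives every dose, one dose per session. The rotation of dose order across subjects is our own assumption, added so that dose is not tied to session number; the function name and labels are hypothetical.

```python
def within_subject_schedule(subjects, doses):
    """Map each subject to a list of doses, one per session.

    Every subject receives every dose; the starting dose is rotated from
    subject to subject so that dose order differs across subjects.
    """
    n = len(doses)
    return {
        subject: [doses[(start + session) % n] for session in range(n)]
        for start, subject in enumerate(subjects)
    }


schedule = within_subject_schedule(
    ["subject 1", "subject 2", "subject 3"],
    ["dose 1", "dose 2", "dose 3"],
)
```

With this arrangement a dose-response curve can be drawn for each individual, so differences among subjects are no longer confounded with differences among doses.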

There is no simple rule for deciding on the number
of subjects. The critical question is whether the ex- 
periment can be replicated, and the most straight- 
forward answer comes from successful replications. If 
unknown and/or uncontrolled variables are operating 
in the experiment, replication will be difficult, and 
the experimenter will be aware of trouble with his 
technique. Thus, when the experimenter has reason
to suspect uncontrolled variables, he is wise to use 
more than one subject. 

There are circumstances where a single subject is 
adequate. Although opinions may differ on this mat- 
ter, the circumstances are approximately as follows: 
(1) The experimenter is working in a well-controlled 
experimental situation with a thorough knowledge of 
his techniques and his subjects. Thus, the experimenter
probably has the problem of uncontrolled
variables well in hand. (2) The results from the single 
subject are in accord with previous data and fit plau- 
sibly into a well-understood body of knowledge. (3) 
The experimenter has studied other subjects with 
related procedures, and the results are consistent. Sid- 
man’s (1960) more extensive discussion of this matter 
under the topic of “systematic replication” is recom- 
mended to the interested reader. 

Under these circumstances the probability of suc- 
cessful replication is high, so that additional subjects 
are unnecessary. The same amount of experimental
time might be better spent in studying variations on
the experiment rather than replicating. By the same
token, if these circumstances do not hold, the use of 
additional subjects is indicated. 


Why Experiment with Drugs? 


The reasons for selecting a drug for an experi- 
mental study are no less complicated than the reasons 
for investigating any variable which affects behavior. 
However, at least a few of the more common reasons 
can be indicated: 


CURIOSITY ABOUT THE BEHAVIORAL 
EFFECTS OF DRUGS 


How will atropine affect conditioned avoidance behavior?
Is behavior maintained by a fixed-interval
schedule more sensitive to drugs than behavior maintained
by a fixed-ratio schedule? Such questions are
typical of those which a behavioral pharmacologist
might find interesting. Drugs can be powerful vari- 
ables; they can eliminate behavior altogether or in- 
crease it manifold. Any drug might have an interest- 
ing or unusual action in a behavioral situation, and 
the fact that such a situation occasionally arises may 
be quite enough to maintain the activity of a scientist. 


PRACTICAL UTILITY OF DRUGS IN 
HUMAN AFFAIRS 


The use of drugs in treating human ills represents
the most socially valuable application of pharmacol- 
ogy. As a result, a great deal of research is directed 
toward potential applications of drugs. For example, 
scientists in government and industrial laboratories 
study thousands of newly synthesized organic chem- 
icals every year in the hope that they will discover a 
medically useful drug. There are also a number of 
other practical implications of drug research. In re- 
cent years, for example, increasing attention has been 
given to problems of behaviorally toxic effects of 
chemicals (Sparber, 1972; Spyker, Sparber, & Goldberg,
1972) administered during gestation or chronically
in the adult animal (Weiss and Laties, 1969).
Further, the seemingly ever-growing incidence of drug
abuse has posed an enormous research problem for 
behavioral pharmacologists. 


ANALYSIS AND VERIFICATION 
OF CLINICAL FINDINGS 


A drug which is useful in the clinic is interesting 
to the research worker for several reasons. He may 
wish to understand more fully the mechanism respon- 
sible for the clinical effect than can be done conve- 
niently and without danger in human patients. In the 
laboratory with animal subjects, he can readily per- 
form surgery, administer toxic doses, or implant elec- 
trodes in an effort to understand the drug’s action. 
In another case he may use the drug to determine 
whether the laboratory procedures are relevant to a 
clinical problem. For example, suppose a researcher 
has found a way to disrupt the complex conditioned 
behavior of guinea pigs by injecting an extract from 
the blood of psychotic patients and he wants to know 
if his experimental situation is related to the psychoses 
of human patients. If Drug A is known to affect the 
behavior of psychotic patients favorably, and if Drug 
A also reduces the extract-induced disruption of the
behavior of the guinea pigs, then the researcher has
reason to suspect that his behavioral technique might
be useful in studying antipsychotic agents, such as
drugs. However, he may be incorrect. Drug A may 
alter the disrupted behavior of the guinea pigs for 
entirely different reasons from those responsible for 
changing the behavior of psychotic patients, and the 
laboratory situation may bear only a superficial re- 
semblance to clinical psychosis. 


ANALYZING BEHAVIORAL PROCESSES 


A drug can occasionally be found which has a cer- 
tain main effect without seriously disrupting second- 
ary effects. This drug can then be used as an analytical 
tool. For example, Dews (1955a) showed that behavior 
maintained by a fixed-interval (FI) schedule of rein- 
forcement was much more sensitive to the effects of 
pentobarbital than was behavior maintained by a 
fixed-ratio (FR) schedule. In a related study, Herrnstein
and Morse (1956) examined the effect of pentobarbital
on similar tandem FI FR performance, where
the behaviors generated by the two components of the 
schedule were joined in a single performance and 
could not be easily disentangled. When pentobarbital 
was given in a high dosage, the pause characteristic of 
fixed-interval behavior was sharply changed while the
fixed-ratio behavior remained unchanged. Thus, the 
drug experimentally separated the two behaviors and 
gave the experimenters additional evidence that the 
complex tandem performance could indeed be prop- 
erly analyzed into the simple components. As another 
example, Thompson and Pickens (1970) compared pat- 
terns of self-administration of stimulants and opiates 
by infrahuman subjects. They found that stimulants 
such as the amphetamines and cocaine were self- 
administered in a highly regular pattern with ex- 
tremely narrow distributions of inter-response times. 
Opiates, on the other hand, engender a bimodal distribution
of inter-response times which is far flatter
and more variable. 


Baseline Stability 


The ideal behavioral baseline should be stable. 
Stable means that the behavior remains about the 
same from one observation period to another (i.e.,
from session to session or from hour to hour). For 
example, if an animal’s lever-pressing rate to avoid a 
shock remained between 9.5 and 10.5 responses per 
minute over 20 sessions, then the behavior would 
surely be regarded as stable because of the low vari- 
ability. One would have considerable confidence that 
the response rate in the next session would remain 
between 9.5 and 10.5 responses per minute. If a drug 
were given prior to this session and the response rate 
went up as little as 20 percent (to 12 responses/minute),
one would still conclude that the drug had increased
the rate because of the clear departure from
the usual variability of the baseline. 

Note the relation between the degree of stability 
and the magnitude of effect with which the experi- 
menter can work. The greater the stability, the 
smaller the effect which can be reliably studied. For 
example, if the mean of 20 control values is 10.0 
responses/minute and the range is ±0.1 response/minute
(a very stable baseline), then a 10 percent increase
to 11 responses/minute following a drug injection
would be considered a reliable effect. On the other
hand, with the same mean and a range of ±5 responses/minute
(a less stable baseline), a 10 percent
“increase” above the mean would be well within the 
normal variation. Thus, a drug dose which was in- 
jected before the session would be considered ineffec- 
tive. Statistical tests of significance can be used for a 
more formal analysis of this issue. By any analysis, 
however, greater baseline stability makes for easier 
evaluation of small drug effects. 
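
The informal rule used in these examples, that a drug effect is credible when the drug-session rate falls outside the range of control values, can be written as a short Python sketch. The function name and the baseline values are hypothetical, and a formal significance test could be substituted for the simple range check.

```python
def outside_baseline_range(baseline_rates, drug_rate):
    """True if the drug-session rate lies outside the range of control values."""
    return drug_rate < min(baseline_rates) or drug_rate > max(baseline_rates)


# Hypothetical control sessions, both with a mean of 10 responses/minute.
stable_baseline = [9.9, 10.0, 10.1, 10.0, 9.9, 10.1]
variable_baseline = [5.0, 15.0, 10.0, 12.0, 8.0, 10.0]

effect_on_stable = outside_baseline_range(stable_baseline, 11.0)
effect_on_variable = outside_baseline_range(variable_baseline, 11.0)
```

The same rate of 11 responses/minute counts as a departure from the stable baseline but not from the variable one, which is the point of the comparison in the text.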

Extreme baseline stability can be a mixed blessing. 
It is sometimes a hallmark of powerful control by
determining variables which may make it hard to pro- 
duce any departure. Simple escape behavior to an in- 
tense electric shock (e.g., the rat presses a lever to 
turn off a shock) is a case in point. This behavior is 
quite stable and typically occurs less than a second 
from the shock onset. However, a drug must produce 
a massive effect, such as making the rat severely ataxic, 
before the escape responding is reduced. 

Another case where stability assumes secondary im- 
portance involves behavior which is interesting partly 
because it is inappropriate to or fails to meet the cur- 
rent environmental contingencies. Such behaviors are 
not likely to remain stable. However, a drug study of 
individual subjects in such situations is not necessarily 
difficult since the behavior frequently is temporarily 
stable or undergoes slow, systematic change. A drug 
which rapidly brings the behavior under the control 
of the current contingencies (a “therapy” effect) pro- 
vides an interesting outcome. Morse and Herrnstein 
(1956) described a pigeon which was required to peck 
a key 160 times (FR 160) for a food reinforcement. 
For the conditions of the experiment the number of 
responses required was overly large, so that the bird 
sometimes paused half an hour or more before working
for a reinforcement (a “strained” fixed-ratio performance).
Methamphetamine not only greatly increased
the bird’s pecking rate immediately but the
rate remained high the next day when the drug was 
no longer present. The high rate in the next session 
was presumably due to the unusually large number of 
reinforcements made possible by the drug’s action on 
the previous day. Baseline stability, while it makes 
drug work more convenient and exact, need not be a 
critical consideration. Semi-stable procedures may per- 
mit useful observations which are not possible with 
the more conventional techniques. 


PRINCIPLES OF DRUG ACTION 


While the student of operant behavior is usually 
aware of behavioral factors in designing drug-be- 
havior research, he may be unfamiliar with basic 
pharmacological variables. In attempting an under- 
standing of the principles of pharmacology, it is well 
to keep in mind several basic classes of variables: 


A. The type of drug

B. The route of administration

C. The relationship between the dose of drug administered
and the magnitude of response

D. Absorption and distribution

E. Time course of drug effects

F. Distribution

G. Fate

H. Multiple administrations of the same drug.


The Type of Drug 


It would clearly be useful if the many behavioral 
drugs could be arranged on the basis of common 
properties into a small number of categories. Thus, 
we could study, understand, and remember significant 
facts about the small number of categories rather than 
being confused by a mass of particulars. The only 
problem would seem to be the selection of appro- 
priate criteria for the various categories. Unfor- 
tunately, no classification scheme turns out to be 
really satisfactory. For example, one logical choice for 
a classification criterion might be the mechanism of
action, such as the locus of action within the CNS and 
the neurotransmitter involved. However, the informa- 
tion on most drugs is much too scanty and speculative 
to permit classification on this basis. Another more 
molar possibility might be to try to classify drugs as
stimulants or depressants on the basis of whether the
drug increases or decreases the response rate on some
standard schedule of reinforcement. The problem 
with such categories is that drugs usually have more 
complex effects. For example, pentobarbital can in- 
crease fixed-ratio response rates at low doses and de- 
crease rates at high doses (Waller and Morse, 1963). 
Is pentobarbital to be classed a stimulant or a depres- 
sant? Atropine increases the rate in the initial section 
of fixed-interval scallops and decreases the rate in the 
terminal section (Boren and Navarro, 1959). Should 
the atropine effect on the initial section be classed 
stimulant and on the terminal section depressant? Or 
should atropine, based on the entire effect, be called a 
disruptant? These examples illustrate an inherent 
problem with drugs: They have multiple and complex 
effects and they therefore resist classification into any 
one category. 

Drugs are sometimes categorized by chemical struc- 
ture. Classification by structure can be useful when 
several compounds with a similar structure have 
similar effects. The classical case for CNS drugs is the 
barbiturate structure. Amobarbital, pentobarbital, 
secobarbital, etc., are all used to induce sleep, and 
therefore, discussing these compounds as “barbitu- 
rates” is meaningful. However, we sometimes forget 
that drugs used in medical practice are often selected 
in a thoroughly biased way. In the laboratories of 
pharmaceutical manufacturers, organic chemists synthesize
a great many compounds for possible therapeutic
use, and a common starting point for the
synthetic program is a known useful (and salable)
drug. Thus, after the success of the first barbiturates 
(barbital and phenobarbital), over two thousand 
variations were synthesized. Furthermore, industrial 
pharmacologists screened the compounds for the abil- 
ity to put mice or rats to sleep. Largely as the result 
of such activity, a number of hypnotic barbiturates 
were made available to the physician and to the 
pharmacologist. However, it would be utterly incor- 
rect to think that all compounds with a barbiturate 
nucleus are hypnotic drugs. Indeed, there are count- 
less barbiturates which are inactive as hypnotics or 
are toxic at hypnotic doses. 

A similar case can be made for the phenothiazines, 
for which chlorpromazine is the model compound. 
After the clinical success of chlorpromazine in treat- 
ing schizophrenic patients, countless phenothiazines 
were synthesized, and screening programs selected out 
the compounds which had chlorpromazine-like ac- 
tions. The selected drugs were then usually tested in 
the same clinical situation for which chlorpromazine 
had been proven useful. As a result of this carefully 
biased selection procedure (and, of course, partly be- 
cause variations of the phenothiazine structure yielded 
compounds with antipsychotic activity), we now have 
a group of structurally similar drugs with clinically 
similar effects. Nevertheless, there are many inactive 
phenothiazines which did not reach the market and 
are hardly known outside drug company laboratories. 

There are other difficulties with classification by 
chemical structure. Compounds with similar effects 
(such as amphetamine and methylphenidate, or 
chlorpromazine and haloperidol) have quite different 
structures. Sometimes two compounds with only 
minor chemical differences (for example, one methyl 
group or one chlorine atom more or less) have sub- 
stantially different pharmacological activity. Further- 
more, the action of certain molecules separately may 
bear no relation to their action when combined. For 
example, the antipsychotic drug perphenazine is
formed largely by joining phenothiazine to piper-
azine, both of which are used to destroy intestinal
worms. For reasons such as these, it has not proved 
feasible to classify most drugs by chemical structure. 

The common categorization of drugs is on the basis 
of their therapeutic effect. Table 1 lists a number of 
drugs affecting behavior and groups them in catego- 
ries based upon therapeutic usage. These categories 
are subject to the advantages and disadvantages dis- 
cussed above. However, the table serves to list repre- 
sentative behavioral drugs grouped according to a 
widely used classification.

TABLE 1. Representative Behavioral Drugs Classified According to Therapeutic Usage*

I. ANTIPSYCHOTIC DRUGS
    chlorpromazine (Thorazine)
    triflupromazine (Vesprin)
    trifluoperazine (Stelazine)
    perphenazine (Trilafon)
    fluphenazine (Permitil, Prolixin)
    thioridazine (Mellaril)
    haloperidol (Haldol)

II. ANTIANXIETY DRUGS
    meprobamate (Miltown, Equanil)
    chlordiazepoxide (Librium)
    diazepam (Valium)

III. ANTIDEPRESSANT DRUGS
    imipramine (Tofranil)
    amitriptyline (Elavil)
    isocarboxazid (Marplan)
    nialamide (Niamid)
    phenelzine (Nardil)

IV. STIMULANTS
    dl-amphetamine (Benzedrine)
    methamphetamine (Desoxyn, Methedrine)
    methylphenidate (Ritalin)
    magnesium pemoline (Cylert)
    caffeine

V. HYPNOTICS
    pentobarbital (Nembutal)
    secobarbital (Seconal)
    amobarbital (Amytal)
    phenobarbital (Luminal)
    methaqualone (Quaalude, Sopor)
    chloral hydrate (Somnos)

VI. HALLUCINOGENS (PSYCHOTOMIMETICS)
    LSD
    mescaline
    psilocybin

VII. ANALGESICS
    morphine
    meperidine (Demerol)
    methadone
    heroin

VIII. OTHER PSYCHOTROPIC DRUGS
    atropine
    scopolamine
    cocaine
    reserpine
    lithium carbonate
    tetrahydrocannabinol

* The generic or nonproprietary name is listed first, and the trade or
proprietary name is listed second in parentheses.


Routes of Administration * 


The pathway by which a drug is introduced into an 
organism is called the route of administration. The more
commonly used routes can be categorized as either 
oral or parenteral (any route outside the alimentary 
tract). The oral route is often used in infrahuman 
research because it readily allows the administration 
of insoluble or irritating compounds and because it 
facilitates comparison with human studies, which 
typically use the oral route. 

With parenteral administration, the drug is in- 
jected directly into the desired site. Although there 
are other routes for parenteral administration, the 
ones most widely used in infrahuman research are sub- 
cutaneous, intramuscular, intraperitoneal, and intra- 
venous. Since none of the drug can be lost by vomit- 
ing or destroyed by gastrointestinal fluids, the dosage 
is more certain in parenteral administration than in 
oral. In addition, the rate of absorption is usually 
more rapid. 


* This section is based largely on Chapter 2 of Thompson
and Schuster, 1968. 


In subcutaneous (SC) injections the tip of the 
needle is inserted immediately under the skin where 
a solution is expelled. Intramuscular (IM) injections 
are accomplished by inserting the needle deep into a 
muscle mass and expelling a solution or suspension. 
Intraperitoneal (IP) administration is perhaps the 
most commonly used route in rats. The needle is in- 
serted directly into the peritoneal cavity, providing 
rapid drug absorption. Intravenous (IV) administra-
tion is used when immediate action and maximal cer-
tainty of dosage are required. The tip and shaft of the
needle are inserted into the lumen of a vein, and the
drug is directly expelled into the vein. Generally only
aqueous solutions, which will not damage blood and
its constituents or produce local vascular irritation,
may be used.


The Dose-Response Relationship 


According to a common notion, every drug has a 
“just right” dose which is standard, appropriate, and 
physiologically active. This notion may have some
justification where a specific effect of a drug is desired.
For example, when a physician must treat a patient
for an acute bacterial infection, he will probably ad-
minister a “standard” dose of an antibacterial drug.
On the basis of extensive experience he has probably 
selected this dose as being large enough to have a 
therapeutic effect on most patients and small enough 
to avoid undesirable side effects. Even in this re- 
stricted situation, however, the notion of the “just 
right” dose may be inappropriate where the dose hap- 
pens to be too low for a particular infection or too 
large for a particular patient who is sensitive to the 
toxic side effects. 

In a more general sense, the concept of the “just 
right” dose is quite misleading. It overlooks the fact 
that drugs, like most variables, can be applied at 
different levels under varying conditions to produce 
different effects. At the extreme ends of the dose 
range, every drug has a dose which is so low that it is 
ineffective and one which is so high that it is lethal. 
Between these two extremes are dose levels which are 
generally appropriate for a pharmacological study. 

An observation fundamental to all pharmacology is
that the quantity of drug administered is related in an
orderly way to the magnitude of the effect produced. 
The relation between the dose and the magnitude of 
effect is called the dose-response (or dose-effect) rela- 
tionship. An example is shown in Figure 1. The data 
are from an experiment by Waller and Morse (1963) 
and show how pentobarbital affects two pigeons’ key- 
pecking rates. The pecking response was maintained 
by reinforcement on a fixed-ratio (FR 30) schedule 
(i.e., every thirtieth peck produced grain). The dose- 
effect curves for both birds show an orderly increase 
in the response rate as a result of the intramuscular 
injection of 2 and 3 mg (total dose per pigeon). Three 
mg seemed to produce the maximum rate while the 
largest dose (5.6 mg) substantially decreased the re- 
sponse output. The dose-effect curve, taken as a whole, 
shows how pentobarbital over an effective dosage 
range quantitatively affects the FR 30 response rate. 

Why is it important to determine a dose-response 
curve for a drug? Perhaps the major reason is that 
drugs often have different effects at different doses. 
Therefore, full knowledge of a drug’s effects can be 
attained only if a full dosage range is studied. In the 
data shown in Figure 1, pentobarbital both increased
and decreased the response rate—depending on the 
dose. As a general rule, any drug which will increase 
behavioral output at some intermediate dose will 
surely decrease output at some higher dose. The de- 
crease will occur, if for no other reason, because a 
toxic effect can always be produced by excessively
high dosages. The selection of one particular dose
does not permit a valid statement of what the drug
does. To return to the example of Figure 1, if one
selected 3 mg of pentobarbital to study, one would be
convinced that the drug was a “stimulant” which
increased response output. If one chose 6 mg, one
would be equally sure that the drug was a “depres-
sant” which decreased response output. If one picked
.01 mg, the drug would be classed inactive, and if one
picked 100 mg, it would appear to be a deadly toxin.

Fig. 1. The effects of pentobarbital on rate of responding on
FR 30. The dose is plotted on a log scale while the rate is
plotted on a linear scale. The dose-effect curves of two indi-
vidual pigeons (B-2 and B-10) are shown. The points above “S”
show the rate after saline injections (mean of two observations).
The points above “C” show the non-injection control rates
(mean of six to eight observations). The points for each dosage
are the mean of two observations. The dosage is given in terms
of number of mg injected. Since the birds weighed slightly more
than 400 g, the dosage in mg/kg can be readily calculated (From
Waller & Morse, 1963. © 1963 by the Society for the Experimen-
tal Analysis of Behavior, Inc.)

Even if two dosage levels were studied, the con- 
clusion might be misleading. For example, one might 
choose two dosages, one on each side of a maximal 
level, and then find that they had almost the same 
effect. A logical but erroneous conclusion might be 
that increased dosage levels of this drug do not cause 
a greater effect and that the dose-response curve is 
relatively flat. Furthermore, it would be easy to select 
two doses from the left or the right side of the max- 
imum. Thus, one might conclude that larger doses 
either increase the behavioral output or decrease it. 
One can guard against such conditions only by study- 
ing a number of doses distributed over the effective 
dosage range of the drug. 
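
The argument of the last two paragraphs can be sketched in a few lines of Python. The dose-effect values below are invented for illustration (they are not the Waller and Morse data), and the function name, the control rate, and the tolerance are assumptions of this sketch. The function reports the conclusion an investigator would be led to if only the sampled doses had been studied.

```python
# Hypothetical dose-effect values (dose in mg -> responses per second).
# These numbers are illustrative only; they are NOT read from Figure 1.
CURVE = {1.0: 2.0, 2.0: 2.6, 3.0: 3.1, 4.2: 2.6, 5.6: 1.0}
CONTROL_RATE = 2.0  # assumed saline control rate


def describe(doses, curve=CURVE, control=CONTROL_RATE, tol=0.05):
    """Return the conclusion suggested by the sampled doses alone."""
    effects = [curve[d] for d in sorted(doses)]
    if len(effects) == 1:
        # A single dose can only be compared with the control rate.
        diff = effects[0] - control
        if abs(diff) <= tol:
            return "inactive"
        return "stimulant" if diff > 0 else "depressant"
    first, last = effects[0], effects[-1]
    peak = max(effects)
    # An interior maximum is visible only if doses on both sides were sampled.
    if peak > first + tol and peak > last + tol:
        return "inverted-U"
    if abs(last - first) <= tol:
        return "flat"
    return "increasing" if last > first else "decreasing"


if __name__ == "__main__":
    for sample in ([3.0], [5.6], [2.0, 4.2], [1.0, 2.0, 3.0], list(CURVE)):
        print(sample, "->", describe(sample))
```

With these assumed values, a single dose yields "stimulant" or "depressant" depending on which dose was chosen, two doses straddling the maximum yield "flat", and only the full range recovers the actual shape of the function.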


As another example of confirmation by other doses, 
suppose that the results from a low dose suggested 
ambiguously that the subject’s ability to discriminate 
between a steady light and a flashing light had de- 
teriorated slightly. To clarify this matter, the investi- 
gator might consider two alternative procedures. One 
is to study the low dose again to see if the small effect 
on the discrimination could be observed. If so, he 
would have greater confidence in the effect. In all 
probability, however, the effect will be as ambiguous 
as before (within the range of extreme control effects), 
so that a definite conclusion may still be difficult. A 
second alternative is to increase the dose (as one 
would normally do in determining a dose-response 
relationship) to determine whether the effect on the 
discrimination is increased. If the effect now emerges 
as large and clear, the investigator’s confidence in the 
effect increases a great deal. This principle of experi- 
mental design is not limited to drug-behavior experi-
ments. In the study of any independent variable
where the effects are small, it is often more efficient to
enlarge the effect by intensifying the variable than
simply to replicate the small effect.

A further use of the dose-response curve is in the 
quantitative comparison of two or more drugs. The 
pharmacologist will frequently want to know which of 
two drugs is the more potent, or synonymously, which 
is the more active. In other words, he wants to know 
which drug produces a given effect at the smaller 
dosage. The first step is to determine a dose-response 
curve for each drug. Figure 2 illustrates several pos- 
sible outcomes (assumed values) for Drugs A, B, and 
C. The figure might, for example, represent the de-
creases in the rate of avoidance responding produced
by three depressant drugs. Comparison of Drugs A
and B is easy. Drug A is clearly more potent than
Drug B in the sense that equivalent effects are pro- 
duced at lower doses by Drug A. The comparison is 
easy partly because the effective dosage ranges over- 
lap very little but largely because the dose-effect 
curves are parallel. The parallel feature permits one 
to reach the same conclusion about potency regardless 
of the size of the effect. In Figure 2, Drug A is about 
eight times more potent than Drug B, regardless of 
whether the comparison is based on a 50 percent 
effect, a 25 percent effect, etc. Furthermore, because 
of the parallel curves, it is possible to calculate a 
single value for each drug which represents its po- 
tency. 

The comparison of Drugs A and C is considerably 
more difficult. The dose-response curves are not paral-
lel, and the effective dose ranges overlap. To be more
specific, Drug C appears to be more potent at 1 and
2 mg/kg and less potent at 8 and 16 mg/kg. In such
cases expression of potency in terms of the median
effective dose (ED50) of the two drugs is arbitrary and
misleading. Statistical or computational devices do 
not help since the ambiguity is inherent in the data. 
The most convenient solution is to end the search for 
a single number which relates the potency of each 
drug and simply to recognize the characteristics of 
the two dose-response curves. 

Note the likelihood of error if Drugs A and C were 
to be compared at a single dose of each instead of the 
full dose range. If 1 mg/kg were used, one would defi-
nitely conclude that Drug C was more potent; if 16
mg/kg were used, one would conclude equally defi-
nitely that Drug A was more potent. To gain com-
plete and accurate information about a drug there is
no substitute for a study which determines a full dose- 
response curve. 
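
The role of parallel curves in potency comparisons can be illustrated with a short Python sketch. It assumes, purely for illustration, that each dose-effect curve is a straight line when percent effect is plotted against log2 dose; the slopes and intercepts for the three hypothetical drugs are invented to resemble the situation described for Figure 2 and are not taken from it. The function names are likewise assumptions of this sketch.

```python
# Each hypothetical drug is (slope, intercept) of an assumed linear relation:
#     percent_effect = slope * log2(dose_mg_per_kg) + intercept
DRUG_A = (20.0, 10.0)
DRUG_B = (20.0, -50.0)   # same slope as A, shifted three log2 units to the right
DRUG_C = (5.0, 30.0)     # shallower slope, so not parallel to A


def dose_for_effect(effect, slope, intercept):
    """Dose (mg/kg) at which the assumed line reaches the given percent effect."""
    return 2 ** ((effect - intercept) / slope)


def potency_ratio(effect, reference, comparison):
    """How many times more potent `reference` is than `comparison` at this effect."""
    return dose_for_effect(effect, *comparison) / dose_for_effect(effect, *reference)


if __name__ == "__main__":
    for effect in (25.0, 50.0):
        print(effect, "A vs B:", round(potency_ratio(effect, DRUG_A, DRUG_B), 2))
        print(effect, "A vs C:", round(potency_ratio(effect, DRUG_A, DRUG_C), 2))
```

Under these assumptions the A-to-B ratio is the same (eight) at every effect level, so a single potency figure is meaningful. The A-to-C ratio falls below one at the 25 percent effect and rises above one at the 50 percent effect, which is the ambiguity described in the text: no single number summarizes the comparison.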


Absorption and Distribution


The amount of drug at the site of action determines 
the effect produced. Once a drug has been introduced 
into an organism, it is absorbed and distributed to
many parts of the body, including the site of action.
The amount of drug reaching the site of action is pri-
marily dependent on the amount administered, its
physical state, the character of the membranes the
drug must pass, and the route that it must take to get
from the site of administration to the site of action.
The time taken to get from the site of administration
to the site of action is largely determined by the rate
of absorption. 

The absorption rate, in turn, is primarily deter-
mined by the route of administration, the physical
properties of the drug preparation, and the rate of
administration. These are the manipulable factors
that determine how rapidly a drug reaches the site of
action. The absolute absorption rate can be expressed
in terms of the change in concentration of the drug at
the site of application over time. Figure 3 illustrates
how theoretical absorption and excretion curves pro-
duce the concentration curve of the amount of drug
at the site of action. This theoretical curve is modi-
fied by a set of variables, not all of which are readily
controllable. The absolute rate of drug absorption
from the site of administration can be considered a
physicochemical relation between the drug and the
transporting medium. The transporting medium
(blood) is the primary factor regulating the absolute
absorption rate.

Fig. 2. Three possible dose-effect curves illustrating the compari-
son of potency.
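
The way in which absorption and excretion jointly produce a rising and then falling concentration can be sketched numerically. The Python example below assumes a simple one-compartment model with first-order (exponential) absorption and excretion; this model, the rate constants, and the function names are assumptions chosen for illustration and are not the curves drawn in Figure 3.

```python
import math


def concentration(t, dose=1.0, k_abs=1.5, k_exc=0.3):
    """Relative drug concentration at time t (hours) after a single dose.

    Assumes first-order absorption (k_abs per hour) and first-order
    excretion (k_exc per hour), with k_abs != k_exc.
    """
    if t < 0:
        raise ValueError("time must be non-negative")
    scale = dose * k_abs / (k_abs - k_exc)
    return scale * (math.exp(-k_exc * t) - math.exp(-k_abs * t))


def time_of_peak(k_abs=1.5, k_exc=0.3):
    """Time (hours) at which the assumed concentration curve is maximal."""
    return math.log(k_abs / k_exc) / (k_abs - k_exc)


if __name__ == "__main__":
    peak = time_of_peak()
    for t in (0.0, 0.5, peak, 3.0, 6.0, 12.0):
        print(f"t = {t:5.2f} h   concentration = {concentration(t):.3f}")
```

In this sketch, raising the absorption constant (as a faster route of administration would) moves the peak earlier, while lowering the excretion constant prolongs the declining limb. The resulting curve is not symmetrical, which agrees with the qualification attached to the theoretical curves.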

Drugs tend to move from sites of high concentra- 
tion to areas of lower drug concentration. However, 
movement of a drug from the site of administration 
along a concentration gradient very seldom limits ab- 
sorption rate. Most frequently, the amount of blood 
flowing through tissue determines how rapidly a drug 
will be absorbed from surrounding tissue. Therefore, 
anything that modifies circulation—exercise, tempera- 
ture, presence of other drugs—also alters rate of ab- 
sorption. In intravenous administration, of course, 
absorption is not a limiting factor, and blood levels 
of a drug reach their maximal concentration immedi- 
ately. Figure 4 presents the comparative durations of 
action curves for intramuscular, intravenous, sub- 
cutaneous, and oral routes of administration, in which 
serum concentrations of penicillin were determined 
for various periods following administration. Clearly, 
intramuscular administration most closely approx- 
imates intravenous administration, while oral admin- 
istration and subcutaneous routes, though very slow, 
maintain serum drug levels for a longer period. Al- 
though intraperitoneal administration is associated 
with a duration curve similar to that obtained with 
the intramuscular route, it provides a slower but 
longer lasting peak drug concentration. 


Time Course of Drug Effects 


Many behavioral variables can be applied and re- 
moved almost instantaneously. A light can be turned 
on and off; a shock can be delivered and removed. 
Such events are public and easily observed so the 
experimenter knows when the variable is present and 
at what intensity. The situation is not as simple with
a drug. Although the experimenter knows he has in-
jected an animal with a drug, he does not know in
advance when the drug will take effect, the drug con-
centration at the site of action, or how long the effect
will last. For this sort of information concerning the
drug’s time course, the basic source is generally ex- 
perimental and determined individually with each 
behavioral preparation. 

Every drug has its own time course. As shown in 
Figure 4, there is a delay in onset of action (with IP
and IM routes), an increasing effect up to a peak, and
finally a decreasing effect until the predrug state is
again reached, when the drug has been metabolized
or excreted. To illustrate such a relation, consider the 
study by Grove and Thompson (1970) dealing with 
the effects of pentobarbital on food-reinforced FR 120 
schedule performance by rats. The top frame in Figure 
5 shows baseline FR 120 performance on the left side, 
followed by the effects of a saline injection on the right 
side. A pause occurred, even after saline, as indicated 
at C on the cumulative record. The three sets of 
cumulative records below are labelled 5, 10, and 20, 
corresponding to the dosage in milligrams/kilogram. 
In general, pentobarbital suppressed the overall re- 
sponse rate by successively increasing pausing. At 5 
mg/kg the drug increased pausing immediately after 
reinforcement; in addition, a series of pauses and bursts
of responding occurred in later ratios (e.g., at E). At 10
mg/kg a pause in responding of 45 min duration
occurred following the first reinforcement. Then about
50 responses were completed at an intermediate rate of
about 30 responses per min. Responding was almost
completely suppressed by the 20 mg/kg dose with slight
responding resuming toward the end of the session. As
can be seen from the foregoing example, the duration of
action of a drug can be indicated by a period of dis-
ruption or alteration of the ongoing baseline perfor-
mance. In this case the length of disruption and the
dosage of the drug were closely correlated.

Fig. 3. Theoretical absorption and excretion curves, yielding a
curve of concentration at the site of action. Although sym-
metrical curves of this type are theoretically possible, almost
all decay curves are hyperbolic and bear little resemblance to
the absorption curves. Similarly, the resultant curve of the
actual amount reaching the site of action is never symmetrical
and is modified by numerous factors, as discussed in the text
(From Marsh, D. F., Outline of Fundamental Pharmacology,
1950. Courtesy of Charles C Thomas, Publisher, Springfield, Ill.)


Distribution 


When the drug concentration in the blood equals 
the concentration at the site of administration, ab-
sorption is said to be complete. This does not, how- 
ever, imply that the drug has been equally distributed 
to all tissues of the body. The factors determining 
differential distribution are poorly understood, but 
some variables are known to be important. Drug 
molecules vary greatly in size, from methanol, with a 
molecular weight of 32, to some of the biological 
macromolecules with molecular weights of up to 
4 × 10⁷ (Bernal, 1958). Obviously such variability in
molecular size is reflected in differential rates of dis-
tribution.

The solubility properties of the drug comprise an-
other factor known to alter distribution. For example,
thiobarbiturates are very fat soluble, and therefore
tend to be rapidly distributed in adipose tissue. Other
compounds tend to have affinities for proteins of
blood plasma, not on the basis of their solubility but
because of their protein-binding properties. Finally,
distribution to tissues depends on the presence and
concentrations of the same or similar drugs in those
tissues. Addition of the same drug or of its antag-
onists may lead to no increase in concentration in
given tissues if receptor sites for that drug are already
saturated.

Fig. 4. Intravenous (IV), intramuscular (IM), oral (PO) and
subcutaneous (SC) routes of administration and serum concen-
trations of penicillin. Three milligrams of penicillin G per
kilogram of body weight were administered at various times to
one individual, and the amounts of penicillin activity in the
serum determined at time intervals. It is readily apparent that
certain routes of administration greatly influence uptake and
also elimination. Much drug is wasted by some routes, and
more frequent administration of the drug is necessary if effective
blood levels are to be maintained (From Marsh, D. F., Outline
of Fundamental Pharmacology, 1950. Courtesy of Charles C
Thomas, Publisher, Springfield, Ill.)

Fig. 5. Cumulative records illustrating the effects of saline, 5.0,
10.0, and 20.0 mg/kg IP of pentobarbital on food-reinforced
FR 120 schedule performance. The primary effect of pento-
barbital on FR performance was to exaggerate pausing, with
minimal effects on running rate. (From Grove & Thompson,
1970.)


Fate 


Following absorption, a drug may undergo trans- 
formation in the body and be ultimately excreted 
either unchanged or as a biotransformation product. 
The biotransformations that a drug undergoes and 
the mechanism of its excretion are referred to as the 
fate of a drug. Figure 3 presented the theoretical ex-
cretion curve, revealing the assumption that the mech-
anisms of drug excretion are diametrically opposite to
those of absorption. As a matter of fact, the routes of
excretion are very seldom the simple inverse of those
of absorption. It is worthwhile to consider briefly the
most common routes of excretion: the kidney, the
lungs, the skin, the bile duct, and the intestines.

Volatile agents, such as the anesthetic gases and 
alcohol, are excreted across the pulmonary membrane. 
We are all well aware that sodium chloride is ex- 
creted in part across the skin; however, few other 
compounds of significance are found on the skin sur- 
face. Organic arsenicals are among the drugs ex-
creted across the bile duct, and certain agents like
quinine, as well as some sterols, are excreted in the
feces. The vast majority of drugs are excreted by the
kidneys. Because of the central role of the kidneys in 
removing drugs from the body, proper functioning of 
these organs is of extreme importance in drug re- 
search. Renal damage may increase a drug’s duration 
of action; it may even have lethal consequences at a 
dosage that would otherwise be well within a toler- 
able range. 

Some drugs, such as the inhalant anesthetics, are 
excreted from the body in unchanged forms. Most 
drugs, however, undergo some chemical changes prior 
to excretion. The transformation of a drug with a 
specific biological action to an inactive form, or to a 
form with different effects, is called biotransformation. 


Multiple Administrations of the Same Drug 


In behavioral pharmacology research, where sev-
eral replications of a procedure on the same animal
are desirable, the investigator would much prefer
minimal interaction between successive administra-
tions of the same drug. At times, however, it is found
that the dose required to produce the same effect
must be increased on successive administrations. Or,
when the same dose is repeated, the effect becomes
smaller with each dose. When this occurs, it is said
that tolerance has developed. Multiple administra-
tions of the narcotic analgesics, barbiturates, and 
amphetamines are particularly likely to lead to the 
development of tolerance. Certain other chemically 
related drugs can, when administered in place of the 
original drug, produce a very similar response; at 
times, they can substitute for the original drug. Usu- 
ally, as tolerance develops to the original drug, tol- 
erance also develops to substitute drugs so that suc- 
cessively higher doses of such drugs are also required 
to produce the original effect. Under these conditions, 
it is said that cross tolerance has developed. 

If discontinuing a drug which has been adminis- 
tered repeatedly and regularly precipitates a charac- 
teristic syndrome of illness (often including vomiting, 
diarrhea, convulsions, and even death), the animal has 
become physically dependent on the drug. If an an- 
imal reliably self-administers the drug when provided 
with the opportunity, the term behavioral dependence 
applies. Thus, an organism that is physically depen- 
dent may be behaviorally dependent as well, though 
the converse is not necessarily true. Humans who ex- 
hibit behavioral dependence without physical depen- 
dence are said to be habituated to the drug. 

Another problem arises when a drug is readmin- 
istered before the effects of the previous dose have 
disappeared. When a drug has not been entirely ex- 
creted or has not undergone complete transformation 
before a second dose is administered, cumulation re- 
sults. Such factors as the presence of the necessary 
enzymes to carry out the transformation reaction, the 
normal functioning of the excretory mechanism (e.g., 
excretion by the kidney), or storage can affect the like- 
lihood of cumulation. Under these conditions, the 
concentration of the drug in various tissues and fluid 
compartments progressively increases. In general, if a 
drug is administered repeatedly, a portion is trans- 
formed and excreted, but a certain amount remains. 
Cumulation rate depends on the interval between 
administrations and the dose. A typical cumulative 
effect is illustrated in Figure 6, where data from five 
administrations of the same dose of a drug are pre- 
sented. While partial recovery occurs following each 
dosing, the level following the last administration is 
well above that seen on the first injection. 
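
The dependence of cumulation on dose and on the interval between administrations can be made concrete with a small calculation. The Python sketch below assumes that a constant fraction of whatever drug is present remains at the end of each interadministration interval; that assumption, the parameter values, and the function name are illustrative choices and are not taken from the data of Figure 6.

```python
def peaks_after_repeated_doses(dose, fraction_remaining, n_doses):
    """Drug level immediately after each of n_doses equal administrations.

    fraction_remaining is the assumed proportion of the drug still present
    when the next dose is given (a shorter interval means a larger fraction).
    """
    if not 0.0 <= fraction_remaining < 1.0:
        raise ValueError("fraction_remaining must be in [0, 1)")
    level = 0.0
    peaks = []
    for _ in range(n_doses):
        # What is left from earlier doses, plus the new dose.
        level = level * fraction_remaining + dose
        peaks.append(level)
    return peaks


if __name__ == "__main__":
    print("short interval:", peaks_after_repeated_doses(10.0, 0.5, 5))
    print("long interval: ", peaks_after_repeated_doses(10.0, 0.1, 5))
```

With half of the drug remaining at each readministration, the level after the fifth dose is nearly double that after the first and is approaching a plateau of dose / (1 - fraction_remaining). With only a tenth remaining, cumulation is slight, which corresponds to spacing the administrations farther apart.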

Obviously, a major consideration in gauging cu-
mulative effects is interadministration interval. By
spacing successive administrations sufficiently far
apart, it is usually possible to avoid cumulation. How-
ever, the disappearance of the active compound or its
metabolites does not necessarily indicate that the drug
effect has ended. For example, the drug may
cause morphological or biochemical changes in cells
which may far outlast the presence of the active agent.
In these cases, cumulative effects are not defined by the
presence or absence of the drug but by the changes in
the measured drug effect.

Fig. 6. The cumulation of the blood concentration of atro-
lactamide with time when the drug was readministered to a
dog at 24-hour intervals. Note that on the fourth administra-
tion an essentially steady state has been achieved (From Histor-
ical background and general principles of drug action, by J. A.
Wells. In V. A. Drill (Ed.), Pharmacology in Medicine. © 1958
by McGraw-Hill Book Company. Used with permission.)


ANALYZING BEHAVIORAL MECHANISMS 
OF DRUG ACTION 


Specificity of Drug Action 


A fundamental problem in analyzing the ways in 
which drugs alter behavior is to determine the degree 
to which effects observed are specific to the drug and 
to the specific set of conditions investigated. In the 
following pages we will discuss some of the minimal 
conditions that must be satisfied in order to determine 
the specificity of drug action. 

As indicated earlier, typically a small sample size 
will be used in analyzing the actions of drugs in op-
erant behavioral pharmacology. The emphasis in these
investigations is the reversibility of drug effects on the
established behavioral baseline. In general, within- 
subject control procedures are used (that is, A B A 
designs will be employed) in which a given variable, 
such as a drug dose or a schedule value, will be ap-
plied and removed and repeated. This requires a high
degree of reproducibility of the behavioral baseline.
There are basically two approaches to the problem of
the reliability of the baseline. One is the use of the
A B A design in which the manipulation is repeatedly
applied to the same baseline. It is sometimes
difficult to re-establish the baseline once it has
been shifted, such as under conditions of transition
states (Sidman, 1960). In this case, another alternative
is commonly used, involving multiple baselines. Two
or more schedules may be conditioned under distinc- 
tive stimulus conditions which may be presented 
sequentially (such as in chains or multiple schedules), 
or concurrently (as in concurrent schedules or non- 
reversible options). For example, during one session a 
given performance can be evaluated by presenting the 
discriminative stimulus for that behavior (e.g., FR 40 
on the right lever), while on another session a similar 
performance (FR 40 on the left lever) can be exam- 
ined in the presence of a different stimulus to deter- 
mine whether the measured effect can be replicated. 
Such schedules may be presented repeatedly with high 
reliability of performance during each schedule com-
ponent. These procedures are not without their diffi-
culties, because there may be interactions among com-
ponents.

Assuming an effect of a drug is measured at a given 
dose of compound on a given behavioral baseline, the 
question then arises “Is the effect unique to that dose, 
or does it also occur at other doses?” Hence, as in any 
other area of pharmacology, it is necessary to admin- 
ister at least three doses of the drug in question, 
typically using a logarithmic dosage regimen. A paral- 
lel question is: “Is the observed effect specific to the 
schedule value chosen?” Behavior maintained by rein-
forcement schedules which generate high response
rates may be affected differently by a given drug than
behavior maintained by schedules which generate low
rates (cf. Dews, 1955a).
Hence, it may be necessary to evaluate the effects of a 
drug at more than one schedule value. For example, 
Thompson, Trombley, Luke, and Lott (1970) studied 
effects of morphine on rats responding for food on 
FR 10, FR 20, and FR 40 schedules (Figure 7). Had a 
single schedule been studied (e.g., FR 10), it might 
have been concluded that in the dosage range in- 
vestigated (1 to 6 mg/kg IP) morphine generates a 
rather flat dose-response curve. However, when a dif- 
ferent schedule value is investigated (FR 40) a clearly 
inverse curvilinear function emerges. Hence, it is essential
to study both a range of drug dosages and
a range of schedule parameter values to obtain a 
complete picture of drug action. 
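
The two-way design described above (several doses crossed with several
schedule values) can be laid out in a short Python sketch. This is only an
illustration, not the procedure of Thompson et al. (1970): the function
names `log_spaced_doses` and `design_grid`, the half-log step, and the
inclusion of a 0.0 mg/kg vehicle condition are assumptions introduced here.
Only the FR values are taken from the text.

```python
from itertools import product


def log_spaced_doses(lowest_mg_per_kg, n_doses, step=10 ** 0.5):
    """Return n_doses doses, each `step` times the previous one.

    A step of 10**0.5 (about 3.16) gives half-log spacing. The spacing is
    an illustrative assumption, not a value taken from the text.
    """
    if n_doses < 3:
        raise ValueError("at least three doses are needed for a dose-response curve")
    return [round(lowest_mg_per_kg * step ** i, 2) for i in range(n_doses)]


def design_grid(doses, schedule_values, include_vehicle=True):
    """Cross every dose (plus an optional vehicle control) with every schedule value."""
    dose_levels = ([0.0] if include_vehicle else []) + list(doses)
    return list(product(schedule_values, dose_levels))


doses = log_spaced_doses(1.0, 3)
grid = design_grid(doses, schedule_values=[10, 20, 40])
for fr_value, dose in grid:
    print(f"FR {fr_value}: {dose} mg/kg")
```

Each pair in `grid` is one condition to be run; a flat dose-response curve
at one FR value and a curvilinear one at another would only become visible
when every cell of such a grid is filled.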

Fig. 7. Mean overall response rates on FR 10, FR 20, and FR 40
schedules treated with saline, 1.0, 3.0, and 6.0 mg/kg of
morphine sulfate ip. Each mean is based on three values, and
ranges are indicated by vertical lines. (From Thompson et al.,
1970.)

Another useful control procedure involves experimentally
varying the rates of responding, while holding
the reinforcement density constant. For example,
subjects might be conditioned on a fixed-ratio schedule
and the mean interreinforcement time determined.
Then the same animals would be conditioned
on a multiple schedule including a variable-interval 
component having the same overall reinforcement 
density but which maintains a lower rate of respond- 
ing (i.e., Mult FR VI). Under such conditions, differ- 
ences might be presumed to be due to some property of 
the performance, such as rates of responding, rather 
than reinforcement density. 
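
The arithmetic behind this control can be sketched as follows. The sketch
assumes that reinforcer delivery times from the fixed-ratio sessions are
available in seconds; the function names, the sample times, and the use of
an evenly spaced (arithmetic) list of intervals are hypothetical choices
made for illustration, not details from the studies cited.

```python
def mean_interreinforcement_time(reinforcer_times_s):
    """Mean time between successive reinforcer deliveries, in seconds."""
    if len(reinforcer_times_s) < 2:
        raise ValueError("need at least two reinforcer deliveries")
    gaps = [b - a for a, b in zip(reinforcer_times_s, reinforcer_times_s[1:])]
    return sum(gaps) / len(gaps)


def arithmetic_vi_intervals(mean_s, n_intervals):
    """Evenly spaced intervals between 0 and 2 * mean_s that average mean_s."""
    step = 2 * mean_s / n_intervals
    return [step * (i + 0.5) for i in range(n_intervals)]


# Hypothetical reinforcer delivery times (seconds) from a fixed-ratio session.
fr_reinforcer_times = [0.0, 28.0, 61.0, 90.0, 120.0]
vi_mean = mean_interreinforcement_time(fr_reinforcer_times)
vi_intervals = arithmetic_vi_intervals(vi_mean, n_intervals=10)
print(vi_mean, sum(vi_intervals) / len(vi_intervals))
```

The scheduled mean interval only sets the maximum reinforcement density of
the variable-interval component, so the obtained interreinforcement times
would still need to be checked against the fixed-ratio values.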

Yet another kind of control procedure may be nec- 
essary to evaluate whether the drug effects are specific 
to the particular behavioral consequence employed. 
For example, behavior maintained by food, water, or 
sexual reinforcement may not be affected in the same 
way by all drugs. Recent research on behavior main- 
tained by electric shock presentation suggests that, at 
times, the nature of the consequence is less important 
than schedule considerations (Kelleher & Morse, 1968; 
see also chapter 7 in this volume). However, other 
research dealing with drug-maintained responding 
suggests that the generalization cannot be applied to 
all reinforcers (Thompson & Pickens, 1972).

Finally, it may be that drugs affect two behaviors
differently because their baseline rates differ. For ex- 
ample, Dews (1958) has shown that amphetamines 
affect schedule-controlled performances differently, not 
so much because the reinforcement schedules differ 
per se, but because different schedules generate differ- 
ent patterns of inter-response times. In one study such 
a necessary control was accomplished by manipulating 
the baseline rate during one component of concurrent 
operants. Cherek and Thompson (1973) studied con- 
current key pecking reinforced by access to food and 
by access to a target bird which could be attacked. In 
an initial study it was found that Δ⁹ THC, the active
ingredient in marijuana, had a more marked effect on 
the key pecking maintained by access to an attackable 
object. However, since the baseline rate of the two 
operants differed, it was necessary to complete a final 
manipulation in which the rate of pecking maintained 
by food was driven down by adding a DRL contin- 
gency to the FI food reinforced performance. Even 
when the rates were equated, Δ⁹ THC had a much
more marked effect on the operant reinforced by 
access to an attackable object (Figure 8).

Fig. 8. Effects of Δ¹ tetrahydrocannabinol on key pecking
maintained by access to food (solid) and a target bird (dashed) in
three pigeons. Baseline rates of key-pecking maintained by the
two reinforcers were equated prior to drug administration. (From
Cherek & Thompson, 1973.)


Behavioral Mechanisms By Which Drugs May Act


Arriving at an understanding of the mechanisms 
by which drugs modify behavior is as complex as the 
permutations and combinations of variables which 
interact at any moment in time to engender a par- 
ticular performance. An approach to analyzing this 
quagmire of variables has been suggested elsewhere 
(Thompson and Schuster, 1968). The rate, pattern,
and form of current operant behavior are determined 
by certain antecedent factors, the current stimulus 
conditions, and the maintaining consequences of be- 
havior. Drugs, as independent variables, interact with 
any or all of these classes of factors to determine the 
particular behavioral outcome. A better grasp of the 
meaning of a particular finding in behavioral pharma- 
cology can be had if one asks, “With which of these 
factors that regulate behavior has the drug interacted?”
Has the drug altered the deprivation state (an
antecedent variable), stimulus control (a current stimulus
variable), response topography (a property of the
response), or the reinforcer or schedule control (a consequent
variable)? In the succeeding pages a number
of studies are discussed within the foregoing framework.
This is intended not to provide an exhaustive
literature review, but rather an illustrated outline of
research dealing with a particular class of variables. 


ANTECEDENT VARIABLES 


Among the more important antecedent variables 
which an organism brings to an experimental situa- 
tion are its past history, and its deprivational state 
established by manipulations prior to the experiment.
Terrace (1963a) demonstrated that pigeons were 
capable of acquiring a discrimination of color and the 
orientation of a line in an “errorless” fashion. An 
“error” was defined as the failure to respond to a stimulus
correlated with reinforcement (an S^D) or a response
to a stimulus correlated with non-reinforcement
(an S^Δ). Errorless learning was established by starting
discrimination training immediately after the response
to S^D had been conditioned, and by progressively reducing
the difference between the S^D and the S^Δ from
an initial large difference to a relatively small, final 
difference. In a subsequent study Terrace (1963b) 
showed that pigeons that had learned a discrimination 
without errors, which appeared superficially the same 
as that established by the more typical method (in- 
volving the occurrence of many errors), responded in 
a dramatically different way to imipramine and chlor- 
promazine. Chlorpromazine and imipramine disrupted 
the pigeons’ performances on a discrimination between
vertical and horizontal lines only if the discrimination
was learned with errors. Birds learning
the discrimination in the errorless fashion exhibited 
no errors whatever in a dosage range from 1.0 to 17.0 
mg of the two drugs. 

A second class of antecedent variables involves in- 
teractions of drug effects with deprivation states. 
Singh and Manocha (1966) studied interactions of 
deprivation conditions and past reinforcement history 
in determining the effects of chlorpromazine. They
found the effect of a given dose of chlorpromazine de- 
pended upon both the degree of deprivation and the 
amount of reinforcement history. That is, extensive 
past histories and high deprivation levels attenuated 
the effects of chlorpromazine. Deprivation states can 
be relevant where drugs serve as maintaining consequences
as well. Meisch and Thompson (1973)
studied the effects of food deprivation on lever pressing
reinforced by ethanol in rats. Figure 9 shows the
effect of increasing food deprivation levels on the disposition
to respond for ethanol. In general, high rates
of ethanol-reinforced responding were maintained as
a function of increasing food deprivation, which also
varied as a function of FR value. Woods, Downs, and
Villarreal (1973) studied two methods of drug depriva- 
tion: Withdrawal of the drug and administration of 
a drug antagonist. The subjects were physically dependent
rhesus monkeys who could self-administer a
narcotic by pressing one lever and receive food by
pressing a second lever. Both methods of inducing
opiate withdrawal evoked nearly identical behavioral
changes on food-reinforced responding. While food-reinforced
behavior was disrupted by both drug
deprivation procedures, responding on the drug lever
was increased by both methods of induction. Thus,
deprivation variables as antecedent procedures can be 
powerful in determining the actions of the drug. 

As discussed earlier, on repeated administration of 
certain drugs, tolerance develops—that is, a higher 
dose of the drug is required to produce the same 
effect. Tetrahydrocannabinol (THC) is one such compound.
McMillan and co-workers (1970, 1971) have
studied the development of behavioral tolerance to
THC in pigeons. The degree of past history of THC
treatment can profoundly affect the degree to which
the drug suppresses ratio-reinforced responding. Although
a dose of 1.8 mg/kg of Δ⁹ THC may be sufficient
to totally suppress responding during the first
3-5 days of administration, after 30 days of adminis- 
tration of gradually increasing doses, pigeons’ re- 
sponse rates following administration of 10 mg/kg 
may be close to normal control values. 

Fig. 9. Effect of FR size on responses per 6-hr session. Ordinate:
responses plotted on a linear scale. Abscissa: FR size plotted on
a logarithmic scale. Different scales were used on the ordinate.
Open triangles: mean ethanol responses during food deprivation
(n = 2); filled triangles: mean ethanol responses during food
satiation (n = 3). Open circles: mean water responses during
food deprivation (n = 4); filled circles: mean water responses
during food satiation (n = 6). The height of the vertical lines
indicates the range; absence of a vertical line at a particular
point indicates that the range was within the area occupied by
the symbol. The results for FR 1 on the left were obtained after
completing the sequence of increasing FR values. (From Meisch
& Thompson, 1973.)

A final class of antecedent variables involves the
superimposition of a history of classical conditioning
on operant baselines. The widely noted placebo effect
involves, at least in part, a conditioned effect due to 
environmental variables paired with drug administra- 
tion. Pickens and Crowder (1967) studied the effects of 
a history of amphetamine injections on locomotor 
activity. After several pairings of the amphetamine in- 
jection with increased locomotor activity, the injec- 
tion of saline was capable of producing a similar 
increase in general activity. Goldberg and Schuster 
(1967) studied conditioned suppression by a stimulus 
paired with nalorphine administration in rhesus mon- 
keys physically dependent upon morphine. Physically
dependent monkeys were first trained to press a lever
for food reinforcement on an FR 10 schedule. A tone, 
initially neutral, was aperiodically presented five min- 
utes before the intravenous injection of nalorphine. 
Nalorphine, because it is a morphine antagonist, can 
suddenly produce withdrawal symptoms. After several 
sessions, conditioned suppression of the food rein- 
forced operant was observed during tone presenta- 
tion prior to the administration of nalorphine. In a 
later study, Goldberg, Woods, and Schuster (1969) dem- 
onstrated increases in responding for morphine in- 
jections in physically dependent monkeys when a 
stimulus paired with nalorphine administration was presented.
Using rhesus monkeys formerly dependent on
morphine, Goldberg and Schuster (1969) demonstrated 
an increased sensitivity to nalorphine’s operant sup- 
pressing effect as compared with control monkeys hav- 
ing no prior history of morphine exposure. Within 
the dosage range employed, nalorphine injections pro- 
duced hypersensitivity in formerly dependent monkeys 
but not in controls. Such effects were observed to 
occur for as long as 60-120 days of complete abstinence 
from morphine, long after any possible residual effects 
of the drug could have persisted, and after physical 
dependence had disappeared. 


STIMULUS VARIABLES 


Environmental factors altering the stimulus control 
of operant behavior have been extensively studied and 
are described elsewhere in the present volume. The 
precise manner and the degree to which various classes 
of drugs alter or participate in stimulus control is a 
matter of some conjecture. That the drugs are capable 
of altering stimulus control is widely known. For 
example, LSD alters a visual stimulus discrimination 
in pigeons in a dose-dependent fashion (Becker, Appel,
& Freedman, 1967). LSD is also known to affect
the shape of a stimulus generalization gradient in rats 
(Dykstra and Appel, 1972). Further, the degree to 
which stimulus control is altered by a drug varies 
with the complexity of the discriminative stimulus 
(Dews, 1955b). However, it is one thing to show that 
a drug produces a dose-dependent change in the de- 
gree of stimulus control and another to describe the 
behavioral mechanism by which such an effect was 
brought about. One attempt at such an account was 
provided by Dykstra and Appel (1972) in which it 
was shown that the shape of a stimulus generalization 
gradient was changed after the administration of LSD. 
The authors noted that a dose of LSD which produced
a change in the gradient did so only at relatively high
rates of responding, suggesting that the change produced
by the drug was more a rate-dependent effect of
the drug than an effect specifically on stimulus con- 
trol. 
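
One common way to ask whether a drug effect is rate-dependent is to express
the rate under drug as a proportion of the control rate and relate it to the
control rate on logarithmic axes. The sketch below assumes that form of
analysis; it is not the analysis reported by Dykstra and Appel (1972), and
the function name and the response rates are hypothetical.

```python
import math


def rate_dependency_slope(control_rates, drug_rates):
    """Least-squares slope of log10(drug/control) on log10(control rate).

    A clearly negative slope means that low control rates were raised
    and/or high control rates lowered, the pattern usually called
    rate dependency.
    """
    xs = [math.log10(c) for c in control_rates]
    ys = [math.log10(d / c) for c, d in zip(control_rates, drug_rates)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical responses per second in three components or stimulus values.
control = [0.1, 1.0, 10.0]
drug = [0.2, 1.0, 5.0]
slope = rate_dependency_slope(control, drug)
print(round(slope, 3))
```

A slope near zero would indicate a proportional change at all control
rates, in which case an explanation in terms of stimulus control becomes
more plausible than one in terms of rate.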

Laties and Weiss (1966) have suggested that the 
degree to which behavior is affected by drugs de- 
pends to a considerable degree on how much the 
baseline behavior is controlled by exteroceptive as op- 
posed to interoceptive stimuli. They studied per- 
formance of pigeons on fixed-interval schedules in 
the presence of five one-minute discriminative stimuli
(i.e., a clock condition). Their primary dependent
variable was the distribution of responses over succes- 
sive minutes of a clock as a function of doses of am- 
phetamine, scopolamine, pentobarbital, chlorproma- 
zine, and promazine. All of the drugs used produced 
substantial changes in the FI response distribution 
when the pigeons had no exteroceptive discriminative 
stimulus correlated with elapsed time. Providing the 
birds with an exteroceptive clock, however, modified 
the response distribution greatly and decreased sensi- 
tivity to amphetamine, scopolamine, and pentobarbi- 
tal, although the sensitivity to chlorpromazine and 
promazine was largely unchanged. These findings suggest
that the source of discriminative stimuli controlling
the performance is important in determining the
reaction to drugs, and further that it is relatively more 
important for some drugs than others. In another 
study (Laties, 1972), pigeons were trained on a chained 
and tandem FR 8 FR 1 reinforcement schedule in 
which eight pecks on one response key were followed 
by a single peck on the second key which produced 
access to grain. If the bird switched keys before the 
count of eight, the series of responses had to be started 
again. During one condition, no external stimulus 
change occurred following the eighth response (i.e.,
tandem condition). During the other condition a stim- 
ulus change invariably occurred following the eighth 
response (i.e., chain condition). The addition of the 
stimulus change made the subjects much more efficient
in meeting the required minimum count before
switching to the reinforcement key. That is, the 
chained schedule condition generated more efficient 
performance than the tandem schedule condition. Re- 
sponse rate, however, remained about the same. When 
a discriminative stimulus was not present (i.e., the 
tandem condition), chlorpromazine, d-amphetamine, 
and scopolamine led to premature switching to the 
reinforcement key. The addition of the external dis- 
criminative stimulus attenuated the effects of scopola- 
mine and d-amphetamine most; chlorpromazine and 
promazine least. 

Overton (1971) has studied the discriminative stimulus
properties of an array of drugs using a T-Maze.
On drug sessions animals pretreated with various
behaviorally active drugs are reinforced for turning one
direction in the maze, and on vehicle control days for 
turning the opposite direction. Certain CNS drugs, 
including the barbiturates and minor tranquilizers, 
exercise strong stimulus control, while phenothiazines 
such as chlorpromazine exercise rather weak stimulus 
control. Rats readily learn the correct turn at the 
choice point in a T-Maze when pretreated with a 
barbiturate, alcohol, or chlordiazepoxide, while re- 
quiring a relatively long period to acquire stimulus 
control when pretreated with a phenothiazine deriva- 
tive. Taking this line of research one step further, 
Barry and Kubena (1972) studied discriminative stim- 
ulus characteristics of alcohol, marijuana, and atro- 
pine in rats. In their procedure rats were reinforced 
on an FR 5 schedule of food reinforcement for pressing
one lever, while if they pressed the other lever they 
received painful foot shock. Under ethyl alcohol, atropine,
and Δ¹ THC, rats rapidly learned to press the
food lever under the appropriate drug or control condition
and to avoid the lever which produced shocks.
Tests for stimulus generalization were then conducted, 
in which rats trained under one drug condition were 
tested with various doses of the same or other drugs. 
The experimental question was “how similar are the 
stimulus properties of drug X to the drug used in 
training?” Lower doses of the drug used in training 
generally produce intermediate percentages of correct 
responding. However, when other drugs were admin- 
istered, the results were more complex. Animals that 
had been trained under pentobarbital tended to re- 
spond correctly when tested under other depressants, 
such as chlordiazepoxide. Animals trained using atro- 
pine as the discriminative stimulus tended to respond 
correctly when receiving doses of other anticholinergic 
drugs such as scopolamine. The Δ¹ THC animals
responded correctly only when other marijuana ex- 
tracts were administered. If alcohol trained animals 
were administered chlorpromazine or d-amphetamine 
under generalization test conditions, incorrect responding
tended to occur. When Δ¹ THC trained animals
were administered various depressant, stimulant,
or hallucinogenic drugs, incorrect or control respond- 
ing tended to occur. 

One of the earliest studies using an operant tech- 
nique to analyze the discriminative stimulus proper- 
ties of drugs was conducted by Cook, Davidson, Davis, 
and Kelleher (1960). Dogs, surgically prepared with 
intravenous catheters, were intravenously administered 
various doses of epinephrine, norepinephrine, or 
acetylcholine prior to a painful shock to the dog’s 
limb. The dog could avoid the shock by lifting his
limb during the discriminative stimulus period preceding
the shock. Acetylcholine served as a highly
effective discriminative stimulus, while l-epinephrine 
served as a relatively weak discriminative stimulus, 
taking approximately twice as long to establish con- 
trol to a criterion of 100% correct responding. 

One of the more elegant studies of discriminative 
control of operant behavior by a drug was conducted 
by Schuster and Brady (1964). In that study rhesus 
monkeys, also surgically prepared with an intravenous 
catheter, were infused with various doses of epineph- 
rine in the presence of which lever pressing was 
reinforced on a fixed-ratio schedule of food presenta- 
tion. When saline was injected, lever pressing had no 
consequence. The independent variable was the dosage
of epinephrine administered, and the dependent
variable was the percentage of epinephrine and saline 
control trials during which the subject met the re- 
sponse requirements for reinforcement. Figure 10 
shows the acquisition curves. By one animal’s 25th 
session, epinephrine exercised substantial stimulus 
control over the animal’s behavior. 
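
The dependent variable in this kind of study reduces to a simple
percentage, computed separately for drug and saline trials. The sketch
below is illustrative only: the function name and the trial records are
hypothetical and are not data from Schuster and Brady (1964).

```python
def percent_trials_met(trial_outcomes):
    """Percentage of trials on which the response requirement was met."""
    if not trial_outcomes:
        raise ValueError("no trials recorded")
    return 100.0 * sum(trial_outcomes) / len(trial_outcomes)


# Hypothetical session: True means the fixed-ratio requirement was completed.
epinephrine_trials = [True, True, True, False, True, True, True, True]
saline_trials = [False, False, True, False, False, False, False, False]

drug_pct = percent_trials_met(epinephrine_trials)
saline_pct = percent_trials_met(saline_trials)
print(drug_pct, saline_pct)
```

Strong stimulus control by the drug would appear as a high percentage on
epinephrine trials together with a low percentage on saline trials, with
the two values diverging over sessions.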

Harris and Balster (1971) first trained rats on mul- 
tiple and mixed reinforcement schedules using ex- 
teroceptive discriminative stimuli. Subsequently, the 
same schedules were used in training with various 
drugs as discriminative stimuli instead of exterocep- 
tive stimuli. Their findings essentially corroborated 
those of Overton, in that depressants such as chlordi- 
azepoxide and ethyl alcohol were capable of establish- 
ing strong stimulus control (in this case over multiple 
schedule performance), whereas phenothiazine deriva- 
tives exercised little or no stimulus control. Similarly, 
hallucinogens such as psilocybin and LSD were very 
ineffective as discriminative stimuli. 


CONSEQUENCE VARIABLES 


Type of consequence: Positive reinforcement. The
types of consequences of behavior and the schedules
according to which they are presented appear to be
fundamental determinants of behavior (Morse and 
Kelleher, 1970). Hence, one would expect that the 
effects of drugs would depend critically upon how the 
drugs interact with consequences and their schedules 
of presentation. Dews and Morse (1961) and Kelleher 
and Morse (1969) have argued that the type of consequence
is a relatively unimportant factor in determining the 
behavioral actions of drugs, whereas the schedule ac- 
cording to which various types of consequences are 
presented is of primary importance. There are several 
lines of evidence supporting this thesis. 

Laties and Weiss (1963) studied the effects of am- 
phetamine, chlorpromazine, and pentobarbital on the 
behavioral regulation of temperature. The rats, after
being placed in a cold compartment, were trained to 
warm themselves by pressing a lever that turned on a 
heat lamp. Amphetamine, at a dose level that by itself 
did not increase the rate at which body temperature 
fell in the cold, increased the frequency with which
the rats turned on the lamp even though the skin
temperature was driven above normal. Chlorpromazine,
at a dose level that accelerated heat loss in the
cold, decreased the frequency with which the lamp 
was turned on. Pentobarbital produced only a transient 
depression directly correlated with base level. Be- 
havioral thermoregulation was impaired by both am- 
phetamine and chlorpromazine, the former by increas- 
ing and the latter by decreasing the optimal frequency 
of bursts of heat. 

Waller and Waller (1962) studied the effects of 
chlorpromazine on behavior maintained by food rein- 
forcement and shock avoidance in a multiple rein- 
forcement schedule. There was no evidence that 
chlorpromazine had a differential effect on avoidance 
behavior or on food-reinforced behavior as a function 
of the type of reinforcer. At low doses, rates of re- 
sponding on the food reinforcement component in- 
creased slightly whereas rates on the avoidance component
remained relatively unchanged. At higher
doses both components showed an approximately 
equal depression of responding. Kelleher and Morse 
(1964) studied the effects of d-amphetamine sulfate 
and chlorpromazine on rates of responding under 
multiple FI FR food and shock-escape schedules. 
Three squirrel monkeys were studied at each multiple 
schedule. Each drug was given IM immediately before 
the beginning of the 2½-hour session. Both amphetamine
and chlorpromazine had similar effects on
behavior maintained on a given schedule regardless of
the type of consequence, i.e., whether it was shock
avoidance or food reinforced. Similarly, Cook and
Catania (1964) studied the effects of drugs on food- 
reinforced and escape behaviors. They studied the 
performance of squirrel monkeys under FI 10-min 
schedules of food reinforcement or electric shock 
termination. In one group of food-deprived monkeys, 
the first key-pressing response after 10 min led imme- 
diately to food presentation. In the second group of 
monkeys, an intermittent electric shock of low intensity
was continuously delivered to the grid floor of
the experimental chamber. The first response after 10
min terminated the electric shock. These schedules of 
food presentation and shock termination are formally 
similar, and both engendered patterns of responding 
characteristic of FI schedules. With both types of 
reinforcers, chlordiazepoxide, meprobamate, imipra- 
mine, and chlorpromazine had similar effects. All of 
the foregoing data would seem to argue that the precise
consequence of the behavior may be less important
than the schedule in determining the drug effect.

The foregoing remarks have to be modified under 
certain circumstances. For example, animals physically 
dependent on morphine derivatives and treated with 
antagonist drugs such as nalorphine or naloxone re- 
spond differently from animals that are not physically 
dependent. Naloxone and nalorphine have entirely 
different effects on morphine-reinforced responding in 
animals than on food-maintained responding. Simi- 
larly, Jacobs (1958) studied the effect of exogenous 
insulin on the choice between a 10% glucose solution 
and a 35% solution. The 10% solution was typically 
preferred by untreated rats. However, when insulin 
was administered prior to choice testing, the prefer- 
ence shifted to the 35% solution. That is, the more 
concentrated glucose solution was shown to be a more 
powerful reinforcer depending upon the insulin in- 
jection. Under such circumstances the type of conse- 
quence is of critical importance. 

Another class of reinforcers which has received par- 
ticular attention in behavioral pharmacology has 
been drugs themselves. Spragg (1940), Masserman
and Yum (1946), Headlee, Coppock, and Nichols
(1955), and Beach (1957) were among the first to pre-
sent experimental evidence that drugs could serve as 
reinforcers. These early studies set the stage for later 
experiments demonstrating more conclusively the rein- 
forcing properties of an array of compounds. This 
literature has been reviewed recently, and the array of 
drugs that serves as reinforcing consequences is approximately
the same as those that are associated with
drug dependence in man (Schuster and Thompson, 
1969; Thompson and Pickens, 1969; 1970). Among the 
drugs that have been shown to be self-administered 
are the narcotic analgesics, barbiturates, certain cen- 
tral nervous system stimulants such as amphetamine, 
cocaine, and cannabis, and some hallucinogens. 

The development of critical importance in fostering
research on drug reinforcement was the technology
necessary to permit chronic intravenous injections of
drug solutions in unrestrained or partially restrained
animals (Pickens & Thompson, 1975; Schuster &
Thompson, 1969). The
intravenous route is especially important because it 
minimizes the delay of reinforcement between the oc- 
currence of the operant and the onset of drug effect. 

The fact that drugs serve as reinforcers for infrahuman
subjects may be viewed by some to be an inter-
esting, if somewhat curious, finding. However, the 
skeptical reader may ask why the reinforcing proper- 
ties of drugs are of any general interest to those in the 
field of operant conditioning. There are three reasons,
two practical and one theoretical. First, the basic
processes which are involved in drug reinforcement in 
infrahuman subjects may be functionally comparable 
to those in humans who use and abuse drugs. If so, our 
approach to the problems associated with human 
drug dependence changes dramatically. Nearly all
drugs which are commonly abused by humans are self- 
administered and serve as effective reinforcers for 
infrahuman subjects (Schuster and Thompson, 1969; 
Thompson and Pickens, 1969). This suggests that the 
principles and knowledge concerning the control of 
operant behavior by other reinforcers can now be 
brought to bear in the human situation, and so lead 
to a better understanding of the controlling variables 
of drug-maintained behavior in man. 

A second and related reason why drug reinforce- 
ment is an interesting phenomenon is the possibility 
that the infrahuman drug self-administration labora- 
tory may serve as a testing ground for future abuse 
potential of drugs introduced for human therapeutic 
purposes. As indicated above, thus far, all drugs which 
have been tested in the infrahuman laboratory and which
are actively self-administered by infrahuman subjects 
(e.g., rats and/or monkeys) are also known to be com- 
monly abused by man. In the years to come, as new
compounds are manufactured and about to be introduced
into the clinic, testing in infrahuman drug
self-administration laboratories may prove to be use- 
ful in predicting which compounds have the highest 
abuse potential and should be subject to special regu- 
lation. 

A third reason for being interested in the drug 
reinforcement phenomena has to do with understand- 
ing basic mechanisms controlling behavior. Drugs as 
reinforcers have certain unusual, if not unique prop- 
erties, which permit them to be used to study rein- 
forcement phenomena difficult to study using other 
reinforcers. It is easy to obtain infrahuman subjects 
with no previous experience of drug reinforcers, so 
that interpretation of effects is not complicated by an 
unknown past history. In addition, infrahuman drug 
self-administration provides us with a way of studying 
mechanisms controlling behavior in the laboratory, 
including biochemical and physiological mechanisms, 
as well as an array of environmental variables, includ- 
ing reinforcement contingencies, stimulus control, and 
other classes of variables discussed throughout the 
present volume. 

When research on drug self-administration by in- 
frahuman subjects was first initiated, the main ques- 
tions that were asked dealt with the types of drugs 
that served as reinforcers. It is now clear that it is the 
rule rather than the exception that drugs serve as 
powerful primary reinforcers for most infrahuman 
subjects. Now the questions that must be answered in- 
clude the following: (1) Under what conditions do 
various classes of compounds serve as reinforcers and 
gain the greatest control over behavior? (2) To what 
extent do drug reinforcers have the same properties as 
other reinforcers? For example, do drugs lead to the 
establishment of schedule control in much the same 
fashion as other reinforcers (Thompson and Pickens, 
1969)? To what extent do manipulations of reinforcement
magnitude yield effects similar to the magnitude
of other reinforcers such as food, water, or brain stim- 
ulation? Do histories of intermittent reinforcement or 
reinforcement with certain types of drugs affect the 
resistance to extinction? Similarly, do all drugs, controlling
for the number of reinforcements, generate
the same degree of resistance to extinction? What is 
the role of deprivation conditions in determining the 
control of behavior by various classes of drug reinforc- 
ers (e.g., those which produce physical dependence 
such as morphine, as contrasted with those which do 
not produce physical dependence such as cocaine)? To 
what degree are behaviors maintained by drug rein- 
forcement subject to stimulus control in the same way 
as behaviors maintained by other reinforcers? Finally,
what is the role of conditioned reinforcement in the 
overall control of drug-maintained responding? Once 
answers can be provided for the foregoing questions, 
it may be possible to begin to alter the degree to 
which drugs as reinforcers control behavior. In other 
words, solutions to problems of drug dependence may 
depend, to a significant degree, upon an understand- 
ing of the basic mechanisms by which drug reinforce- 
ment operates. 


Punished Responding. Morse (1964) studied the 
effects of amobarbital and chlorpromazine on pun- 
ished behavior of pigeons. Key pecking which was 
maintained by a variable-interval schedule of food 
reinforcement was also punished by brief electric 
shocks. Under this simultaneous food reinforcement 
and shock punishment schedule, responding was de- 
pressed to a low and fairly uniform rate that was in- 
versely related to punishment intensity. Morse found 
that amobarbital partially restored responding sup- 
pressed by punishment, while chlorpromazine had no 
tendency to attenuate the suppressing effects of pun- 
ishment. Other investigators using a variety of species, 
including the rat and monkey, have shown similar 
effects—namely that barbiturates and minor tranquiliz- 
ers generally decrease the suppressing effects of punish- 
ment while amphetamines, morphine, chlorpromazine, 
and trifluoperazine usually do not attenuate the sup- 
pressing effects of punishing shock (Geller and Seifter, 
1960, 1962; Kelleher and Morse, 1964, 1968; Morse,
1964; Wuttke and Kelleher, 1970). McMillan (1973) at- 
tempted to discover the mechanism by which various 
drugs attenuate the effects of punishment. The effects 
of a variety of compounds on key pecking responses 
punished by electric shock in a multiple FI 5 FI 5 
punishment schedule were investigated using pigeons. 
Most of the drugs studied increased low rates of both 
punished and unpunished responses, while increasing
higher rates less or decreasing them. However, low rates
of punished responding were sometimes increased 
more by pentobarbital, diazepam, and chlordiazepox-
ide than were matched rates of unpunished respond-
ing. In contrast, d-amphetamine and chlorpromazine
usually increased low rates of unpunished responding 
more than matched rates of punished responding. 
Thus, the effects of drugs on punished responding
appeared to depend upon the control rate of punished 
responding; however, the rate-dependent effect of 
drugs on punished responding is not always the same 
as for unpunished responding. 

A series of experiments conducted by Morse and
Kelleher (1970) and McKearney (1972) has cast con-
siderable doubt upon what has commonly been termed
a “motivational” interpretation of the effects of drugs
on operant behavior. In these experiments, squirrel 
monkeys were trained on schedules of electric shock 
presentation. Monkeys exposed to various reinforce- 
ment histories (typically unsignaled shock avoidance, 
but sometimes shock elicitation or variable-interval 
food-reinforced responding) were exposed to response- 
contingent painful electric shock. Under these condi- 
tions, after sufficient exposure to the schedule, 
lever-pressing performance stabilized and typical fixed-
interval performance emerged. That is, a shock which
under certain other circumstances would serve as an
effective punisher appears to be maintaining behavior. 
Morse and Kelleher (1970) have explored the implica- 
tions of schedules of shock presentation for a general 
understanding of the concept of reinforcement and 
reinforcement schedules. It has generally been found 
that drugs have effects on responding maintained by 
schedules of electric shock presentation that are in-
distinguishable from those on schedules of food-rein-
forced responding or water-maintained responding.
This striking result was not only unexpected but is 
obviously incompatible with any simple notion one 
might have about “tranquilizing” or “anxiety reduc-
ing” effects of drugs as being the primary basis for
determining their behavioral actions. One would as- 
sume that the motivational state associated with a 
schedule of shock presentation would not be at all 
like that under a schedule of food or water presenta- 
tion. 


SCHEDULE CONSIDERATIONS 


The last class of variables which will be considered 
in the present context concerns the schedules accord- 
ing to which response consequences are presented. 
Dews (1955a) studied differential sensitivity to pento- 
barbital of key-pecking performance by pigeons using 
fixed-interval and fixed-ratio reinforcement schedules. 
Dews found that a dosage of pentobarbital which had 
a rate increasing effect on fixed-ratio 50 performance 
produced a markedly rate reducing effect on FI per- 
formance. Although Dews employed either an FI or 
an FR schedule throughout an entire session, other 
investigators have used multiple schedules, in which 
FI and FR components occurred randomly through- 
out each session (Morse and Herrnstein, 1956). With 
a multiple schedule the same subject can be studied 
under drug conditions in which responding is main- 
tained by the same reinforcer but based upon differ- 
ent reinforcement schedules within a single session. 
Thus, differential effects of a given drug on these 
alternating patterns of responding can hardly be at-
tributed to changes related to the reinforcer, but
rather must be interpreted in terms of the reinforce- 
ment schedules. 

In another study, Dews (1958) demonstrated sched-
ule-dependent effects using methamphetamine. The
number of responses made by pigeons in a fixed pe- 
riod of time was greatly increased by methampheta- 
mine when the birds were conditioned using FI 
15-min and FR 900 schedules. However, the rates 
were only slightly increased when the birds were con-
ditioned using other schedules (VI 1-min and FR 50).
The fact that the effect of drugs depends critically on 
the type of schedule and the schedule parameter is 
now widely recognized. For a number of years it was 
thought that the schedule per se was a fundamental 
determinant of the behavioral actions of drugs. How- 
ever, Dews (1958) suggested that perhaps the mecha- 
nism by which schedule-dependent drug effects were 
brought about concerned the number and length of
inter-response times generated by a given schedule. 
That is, schedules which generate short inter-response 
times, such as fixed-ratio schedules, will lead to rate 
decrements following amphetamine administration, 
while schedules which engender long inter-response
times will tend to be associated with rate increases 
following amphetamine administration. This notion,
which has come to be called the “rate dependency 
hypothesis,” has considerable support involving an 
array of drugs. Kelleher and Morse (1969) have sum- 
marized the findings pertaining to rate dependent 
drug effects as follows: “The net effect of ampheta-
mines on the average rate of responding under a
schedule can be analyzed in terms of effects on rates of
responding in different temporal periods of the sched-
ule. . . . Whether amphetamine increases or decreases
responding depends upon the pre-drug rate of re-
sponding as well as the dose. Evidence . . . indicates
that pre-drug rates of one response or more per sec- 
ond are only decreased by increasing doses of am- 
phetamine. Pre-drug rates of less than one response 
per second increased to a maximum and then de- 
creased after increasing doses of amphetamine” (Kelle- 
her and Morse, 1969). Rate dependent drug effects 
have been reported for the amphetamines, barbitu- 
rates, minor tranquilizers such as meprobamate and 
chlordiazepoxide, and morphine (Kelleher, Fry, Dee-
gan, & Cook, 1961; Richelle, Xhenseval, Fontaine, &
Thone, 1962; Smith, 1964; Thompson, et al., 1970). 
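
One common way of examining the relation described above is to express the rate under drug as a proportion of the control rate and to relate that proportion to the control rate on logarithmic coordinates. The following is a minimal sketch of such an analysis, offered only as an illustration. The function name and all rate values are hypothetical and are not taken from any of the studies cited; an ordinary least-squares fit is assumed as the summary statistic.

```python
import math


def rate_dependency_fit(control_rates, drug_rates):
    """Fit log10(drug/control) against log10(control) by least squares.

    Returns (slope, intercept). A negative slope means that low control
    rates are increased proportionally more than high control rates.
    """
    xs = [math.log10(c) for c in control_rates]
    ys = [math.log10(d / c) for c, d in zip(control_rates, drug_rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical rates in responses per second (illustration only).
control = [0.05, 0.2, 0.5, 1.0, 2.0]
drug = [0.20, 0.5, 0.8, 0.9, 1.2]

slope, intercept = rate_dependency_fit(control, drug)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

With values of this kind the fitted slope is negative, which is the pattern the hypothesis describes: rates below about one response per second are raised and rates at or above that level are lowered.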

Some of the more convincing research dealing
with the rate dependency hypothesis involves detailed 
examination of performance within a given simple 
fixed-interval schedule. Smith (1964) studied the effects 
of d-amphetamine (.01-10.0 mg/kg IM) on FI 5-min
performance by pigeons. Specifically, Smith studied
the effects on response rate during the first and last 
minute of each FI 5-min component of the schedule. 
He found that d-amphetamine markedly increased 
the low rates of responding characteristic of the first 
minute of the schedule and decreased the high rates 
of responding characteristic of the fifth minute of the 
schedule. At a dose of 3 mg/kg the maximum overall
rate increase was observed. At this dose the rate of
responding was significantly increased during the first
minute and significantly decreased during the last
minute. A dose of
10 mg/kg decreased overall response rates; however, 
this dose produced a greater increase in rate in the 
first minute than did the 3 mg/kg dose, but also pro- 
duced a more marked decrement in rate during the 
last minute. In short, the change in overall respond- 
ing produced by d-amphetamine was the net result of 
its rate increasing effects early in the interval and its 
rate decreasing effect during the latter portion of the 
fixed interval. 
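
The arithmetic behind this "net result" can be sketched as follows. The response counts are hypothetical and were chosen only to resemble the pattern described (a large proportional increase early in the interval and a decrease late in the interval); they are not Smith's data, and the helper names are illustrative.

```python
def segment_rates(response_counts, segment_seconds):
    """Convert response counts per segment into responses per second."""
    return [count / segment_seconds for count in response_counts]


def percent_change(control, drug):
    """Percentage change from the control value to the drug value."""
    return 100.0 * (drug - control) / control


# Hypothetical counts for successive 1-min segments of an FI 5-min interval.
control_counts = [3, 12, 30, 54, 72]
drug_counts = [24, 36, 42, 48, 45]

control = segment_rates(control_counts, 60)
drug = segment_rates(drug_counts, 60)

# Change within each minute of the interval.
per_segment = [percent_change(c, d) for c, d in zip(control, drug)]

# Change in the overall rate, which pools all five segments.
overall = percent_change(sum(control_counts), sum(drug_counts))

for minute, change in enumerate(per_segment, start=1):
    print(f"minute {minute}: {change:+.1f}%")
print(f"overall: {overall:+.1f}%")
```

In this illustration the overall change is modest even though the first and last segments change substantially in opposite directions, which is why an analysis by temporal segment is more informative than the overall rate alone.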

Although the rate dependency hypothesis appears 
to apply to a considerable range of schedule-controlled 
phenomena, there are several noteworthy exceptions. 
Responding that is maintained at a relatively low 
rate under punishment conditions does not seem to 
follow the rate dependent effect to the same degree as 
responding which is not under the control of punish-
ment. Responding that is punished and therefore
maintained at a low rate is often further decreased
by amphetamine (Geller and Seifter, 1960). The rate
dependency hypothesis would predict an increase in 
these low response rates. Under conditions in which 
the responding has not previously occurred or has no 
programmed consequence, amphetamine may have 
little tendency to enhance responding. Verhave (1958) 
studied bar pressing by untrained rats during 12 daily 
one-hour sessions in which responding had no pro- 
grammed consequence. During the first session, all rats 
responded and the mean number of responses was 
15.7. Over successive sessions, the mean number of re- 
sponses showed an orderly decrement, such that by the 
seventh session the mean response rate was only .8 
responses per hour, and three of the six rats did not 
respond at all. Methamphetamine (2 mg/kg S.C.) was 
administered before the eighth session. The rate de- 
pendency hypothesis would predict increases in these 
low response rates. However, the mean response rate 
remained at .8 responses per hour and four of the six 
subjects did not respond at all. Finally, Dews (1955b) 
studied the effects of methamphetamine on an an- 
imal which received extensive training on a simple 
discrimination, and found that methamphetamine in 
the dosage range .1 to 3 mg/kg IM had no rate in-
creasing effect on responding in the presence of a visual
stimulus in which key pecking had no programmed
consequence. 

Thus, while the rate dependency hypothesis ap- 
pears to apply to a wide array of experimental find- 
ings, there are certain situations in which it does not 
hold. The fact that rate dependency does occur under
a number of circumstances does not necessarily in-
dicate the mechanism by which the rate dependency
phenomenon is engendered. For example, ampheta- 
mine may attenuate or in some way alter the degree 
of stimulus control (Dews, 1955b; Laties and Weiss, 
1966). In a related study Hearst (1964) has examined 
the effects of d-amphetamine on avoidance responses 
in monkeys. Hearst found that amphetamine flattened 
the generalization gradient, once again suggesting an
alteration in stimulus control. Hill (1970) has pre-
sented data suggesting that one of the mechanisms by
which amphetamines alter behavior is by changing the
reinforcing properties of stimuli paired with uncon- 
ditioned reinforcement. In these investigations Hill 
has presented evidence suggesting that amphetamine 
increases the conditioned reinforcing properties of 
such stimulus events. 


TRADITIONAL PROBLEMS FORMULATED 
WITHIN AN OPERANT FRAMEWORK 


Acquisition and Extinction—Learning 


In the introduction to this chapter it was suggested 
that early research dealing with drugs and behavior 
was sometimes misguided. Investigations designed to 
deal with drug effects on such phenomena as learning, 
motivation (e.g., fear, anger, hunger, etc.), or percep- 
tion were often formulated so that no matter what the 
experimental outcome, it would be impossible to de-
termine the mechanisms involved. The development
of behavioral pharmacology over the past decade has 
begun to make it more profitable to ask experimental 
questions pertaining to these very complex issues. For
example, when one is concerned with the effects of 
drugs on learning, as Dews has pointed out (1970), 
one is interested in more than a trivial change of be- 
havior of a student in a classroom, such as the differ- 
ence between being awake and sleeping during a 
learning task. Instead, operant behavioral pharmacol- 
ogy has focused attention on questions such as “In 
what way do various drugs affect the transition from
one steady state to another?” 

As an example of an operant approach to the study 
of drug effects on “learning,” Stolerman (1971) studied 
the acquisition of lever pressing by rats on a contin-
uous reinforcement schedule under saline, chlorpro-
mazine, and chlordiazepoxide. He found that while
both chlorpromazine and chlordiazepoxide reduced 
the rate of acquisition of the lever pressing response, 
there were significant differences both in the degree of 
the reduction and the mechanism underlying the 
differences. Heise and Lilie (1970) studied the effects 
of scopolamine, atropine, and d-amphetamine on 
elimination of responding on nonreinforced trials in a 
discrete-trial situation. Under one set of conditions, an 
exteroceptive stimulus indicated periods when re-
sponses would be unreinforced (i.e., SΔ) while under
other conditions there was no external SΔ. Scopola-
mine impaired performance (that is, reduced the per-
centage of trial responses that were reinforced) to
about the same extent when an external stimulus in-
dicated non-reinforcement as when the stimulus was
absent. d-Amphetamine, on the other hand, impaired
performance only when there was no exteroceptive SΔ.
Barry and Kubena (1971) studied the effects of THC 
on acquisition of shock avoidance by rats. They found
that acquisition of an avoidance response was im- 
proved when performed under the acute effects of 
high daily doses of THC beginning at an early stage 
of training. Facilitation of the acquisition of avoid- 
ance has also been found with various other drugs 
which, in common with THC, have a predominantly
behavioral depressant effect, but impair well-estab- 
lished avoidance responses only at very high doses 
(Barry and Buckley, 1966). Meisch (1972) studied the 
development of ethanol as a reinforcer for rats, de- 
scribing a procedure for establishing rapid acquisition 
of ethanol-controlled behavior. It has been shown
previously that rats will self-administer ethanol solu-
tions when they are substituted for water in a sched-
ule-induced polydipsia procedure (Meisch & Thomp-
son, 1971). In Meisch’s experiment, sessions were run
using a 4-hour polydipsia period preceded by a 2-hour 
period during which concurrent food reinforcement 
was not available. Responding for water, or for 
ethanol on experimental days, could occur at any time 
during the 6-hour session. Little or no responding 
occurred for water during the first 2-hour component, 
while high rates of water-reinforced responding were 
obtained during the subsequent 4-hour period. Fol- 
lowing establishment of this pattern of water-rein- 
forced responding, an 8% ethanol solution was pre- 
sented on days alternating with water control days. 
Using the pattern of water responding as a baseline, 
Meisch was able to show the development of ethanol 
as a reinforcer by comparing the rate of ethanol re- 
sponding during the first 2-hour component with the 
rate of water responding. Using this procedure it was 
possible to establish ethanol as an effective reinforcer
with five exposures to the drug solution.

Griffiths and Thompson (1973, 1974) studied the 
effects of pentobarbital on the elimination of food- 
reinforced fixed-ratio responding by rats during ex- 
tinction. In a series of studies, it was demonstrated 
that the administration of immobilizing doses of 
pentobarbital on the first day of extinction markedly 
reduces the overall number of responses to extinction. 
In a number of control procedures it was shown that 
it did not matter whether a series of pentobarbital 
doses was administered which decreased abruptly or 
very gradually, but rather merely whether the rate 
of responding was markedly suppressed during the 
first part of the first session of extinction. Figure 11
shows responses in extinction by animals treated with
pentobarbital prior to the beginning of a 5-hour ex-
tinction session. As can be seen, there was a very sub-
stantial difference in total responding to extinction as
a function of a high immobilizing dose of pentobar-
bital during the first portion of the extinction session.

Fig. 11. Effects of 20 mg/kg i.p. pentobarbital on responding
during extinction following matched FR 20 food reinforcement
history (rats). Pentobarbital was administered at the arrow
prior to extinction session, with no further drug administra-
tions. Vertical bars at the right side of each curve indicate
range of variability for each group. (Griffiths & Thompson, 1973.)

The foregoing studies indicate that it is possible to 
study phenomena falling within the rubric of “learn- 
ing” but expressed in terms of concepts understand- 
able within an operational experimental analysis of 
behavior. Transition states, such as acquisition and 
extinction or performance shift from one schedule to 
another, are subject to analysis within the framework 
of an experimental analysis of behavior. 


Motivational Factors 


Motivational variables have long been a major 
focus of many investigators in psychopharmacology. 
Psychiatrists and other practitioners working in 
psychopharmacology are interested in reducing anx-
iety and aggression, altering sex drives, and so forth.
Laboratory investigators had hoped to find drugs 
which would selectively alter hunger or thirst. Within 
the framework of an experimental analysis of the be- 
havioral actions of drugs, such phenomena are profit- 
ably approached by manipulating the type of conse- 
quences controlling behavior. Behavior reinforced by 
access to a target which can be attacked (Cherek, 
Thompson, & Heistad, 1973; Hutchinson, Azrin, & 
Hunt, 1968) provides a mechanism for studying the 
effects of drugs on aggressive behavior. Similarly, 
drugs’ effects on food reinforced responding, on re- 
sponding maintained by water, and responding to 
avoid painful shocks, are methods by which one can 
begin to understand the degree to which drugs have 
any selective effects on these motivational states. As 
indicated in our earlier discussion, considerable doubt 
has been cast upon the proposition that drug effects 
are primarily determined by the consequence con-
trolling behavior; they appear to depend to a far
greater degree on the way in which those events are
scheduled to be presented or avoided.


Sensation and Perception 


A third major category of investigations concerning 
drug effects pertains to the way in which drugs alter 
perception or sensation. Drugs such as the hallu- 
cinogens, depressants, and even to some degree stim- 
ulants, are said to alter an organism’s perception. An 
effort to understand the way in which drugs alter per- 
ception can be formulated, to a considerable degree, 
in terms of stimulus control. In an earlier section 
stimulus control was discussed with reference to drug
effects. As can be seen, the locus of analysis is not 
within the central nervous system or the mind, but is 
expressed in terms of covariation between certain ex- 
ternal stimulus events and systematic changes in be- 
havior. 


FUTURE OF BEHAVIORAL PHARMACOLOGY 


The future of behavioral pharmacology appears to 
lie in two primary directions. 


A Fine-Grained Laboratory Analysis of Behavioral 
Mechanisms of Drug Actions 


Weiss (1970) and Weiss and Gott (1972) have pro- 
vided a micro-analysis of drug effects on fixed-ratio 
performance and the temporal structure of behavior. 
Weiss and Gott found that amphetamine and imipra- 
mine shortened all inter-response times within a FR 
30 reinforcement schedule while pentobarbital length- 
ened them. The effects observed were related to what 
Weiss and Gott called “the relatively unitary charac-
ter of fixed-ratio performance and its inherent co-
hesiveness.” Plots of the incidence of inter-response
times greater than one second suggest that ampheta-
mine and imipramine alter fixed-ratio cohesiveness
whereas pentobarbital enhances it (i.e., either the
ratio performance consisted of extremely short and 
extremely long inter-response times, or was totally dis- 
rupted at very high doses). 
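
A micro-analysis of this kind begins with the times at which individual responses occur. As a minimal sketch, assuming that response times within a ratio run are available in seconds, the inter-response times and the incidence of those longer than one second might be computed as follows. The timestamps and function names are hypothetical and are not drawn from Weiss and Gott.

```python
def inter_response_times(response_times):
    """Return the intervals between successive responses."""
    return [later - earlier
            for earlier, later in zip(response_times, response_times[1:])]


def proportion_longer_than(irts, threshold_seconds=1.0):
    """Proportion of inter-response times exceeding the threshold."""
    if not irts:
        return 0.0
    long_count = sum(1 for irt in irts if irt > threshold_seconds)
    return long_count / len(irts)


# Hypothetical response timestamps in seconds (illustration only).
times = [0.0, 0.3, 0.6, 0.9, 2.4, 2.7, 3.0, 5.5, 5.8]

irts = inter_response_times(times)
share_long = proportion_longer_than(irts)
print(f"{len(irts)} inter-response times, {share_long:.2f} longer than 1 s")
```

Comparing this proportion between control and drug sessions gives one simple index of whether the ratio run is being broken up by pauses.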


Applied Implications of Behavioral Pharmacology 


BEHAVIORAL TOXICOLOGY 


In 1969, Weiss and Laties wrote the first major re- 
view dealing with behavioral toxicology. They stated, 
“Many studies of the behavioral effects of drugs can 
be conceived of as attempts to determine selective 
toxicity in the context of a therapeutic aim. Be- 
havioral toxicology is the study of the selective toxic- 
ity as the direct aim.” (p. 320) A number of investiga- 
tions have attested to the sensitivity of operant pro- 
cedures in assessing behavioral actions of toxins. 
Armstrong, Leach, Belluscio, Maynard, Hodge, and 
Scott (1963) studied the effects of mercury vapor on 
performance on a multiple reinforcement schedule in 
pigeons, and found it was possible to produce re- 
versible changes in the behavioral baseline before 
they could detect any overt pathology or gross be- 
havioral disruption. Beard and Wertheim (1967) 
studied the effects of carbon monoxide on temporal
discrimination in human subjects. They were able to
detect the effect of relatively low concentrations of 
carbon monoxide (50 parts/million over a 90-min 
exposure). They also studied the effect of low concen- 
trations of carbon monoxide on responding during a 
spaced reinforcement schedule in rats. The amount of 
carbon monoxide in the atmosphere necessary to pro- 
duce a given change in performance was a function of 
the pause between the responses demanded by the
schedule. That is, when a 30-sec pause was required, 
about 10 min of exposure to 100 parts/million was 
enough to diminish response rate more than two 
standard deviations below the control. With a 10-sec
pause required, about 40 min of exposure was neces- 
sary to produce a comparable decrement in perfor- 
mance. 


DRUG ABUSE


Another area of behavioral pharmacology with sig- 
nificant implications for applied matters is drug abuse. 
Thompson and Schuster (1968), Schuster and Thomp- 
son (1969), and Thompson and Pickens (1969) have 
provided a conceptual framework within which to 
better understand human drug dependence. That
drugs serve as powerful reinforcers is now widely ac- 
cepted, and efforts to understand and modify human 
drug dependence are emerging based on an operant 
interpretation. Much of the published research has 
dealt with alcoholism although some workers have 
begun to deal with problems of modifying the use of 
opiate compounds as well. Mello and Mendelson 
(1970) have studied drinking patterns during work- 
contingent and non-contingent alcohol acquisition in 
human alcoholics. Bigelow and co-workers (1972) in- 
vestigated factors influencing alcohol consumption by 
chronic alcoholics. Bigelow, Cohen, Liebson, and Fail-
lace (1972) studied establishment of controlled drink-
ing by alcoholics. Volunteer chronic alcoholics were
given access to substantial quantities of alcohol in 
situations where they earned the opportunity to par- 
ticipate in an enriched ward environment contingent 
on controlled drinking. The subjects overwhelmingly 
chose to drink moderately. In another study, Bigelow 
and Liebson (1972) examined response-cost factors 
controlling alcoholic drinking. Once again the sub- 
jects were given access to alcohol under experimental 
conditions. When a high ratio requirement was estab- 
lished for access to alcohol, alcohol drinking dropped 
to near zero. In another manipulation, the number
of tokens required to purchase drinks was markedly
increased if the subject drank more than two drinks
per hour, and a great reduction in alcohol consumption
occurred. Pickens, Bigelow, and Griffiths
(1973) studied alcohol consumption in a chronic 
alcoholic under controlled ward conditions. They
studied the effects of a time-out contingency and the 
establishment of stimulus control on drinking by a 
chronic alcoholic over a one-year period. Using these 
procedures, it was possible to reduce the amount of 
alcohol consumed from 1.5 ounces per minute to ap- 
proximately .2 to .3 ounces per minute. 

One of the more impressive applied outgrowths of 
an operant interpretation of drug dependence has 
been Hunt and Azrin’s (1972) work with chronic 
alcoholics in an outpatient setting. Using a combina-
tion of positive reinforcement for non-drug related
behavior and time-out from positive reinforcement 
contingent on drinking, they have been able to estab- 
lish extensive control over alcohol consumption by 
chronic alcoholics. In addition, the patients with
whom they worked maintained a high degree of em- 
ployment and maintained something approximating 
a normal home life following treatment. Liebson, 
Bigelow, and Flamer (1973) have used methadone as
a reinforcer for consuming disulfiram in patients who
were both alcoholics and heroin addicts. Disulfiram blocks
alcohol metabolism and leads to vomiting following
consumption of alcohol. In their technique, alcoholics
were reinforced with methadone contingent on con-
suming their disulfiram. Boudin (1972) has developed
a large scale program based on a system of positive
reinforcement for participating in treatment programs
involving a variety of contingency management con- 
trol techniques for violation of the system. 


PRE-CLINICAL AND CLINICAL
EXTENSIONS OF LABORATORY FINDINGS


Behavioral pharmacology has now reached the 
point where it is meaningful to approach some ques- 
tions concerning clinical therapeutic effects of be- 
haviorally active drugs. Physicians have often used 
behaviorally active drugs based on the assumption 
that a drug functions independently of the environ- 
mental conditions under which the drug is admin- 
istered. 

In recent years interest has grown in the use of 
operant techniques in conjunction with drug therapy 
in various applied settings. Lindsley (1962) first 
demonstrated the applicability of operant condition- 
ing techniques in the measurement of drug-behavior 
interactions in an applied human setting. Subse-
quently Hollis (1968) demonstrated a technique for
measurement of differential behavioral effects of
chlorpromazine in human retardates.

Fig. 12. Effects of chlorpromazine on rate of ball-manipulanda
responding under fixed-ratio 25 schedule of positive reinforce-
ment. Curves represent standard deviation units of change in
response rate from control to drug sessions for subjects DG
and KL. The dashed line at 0 units indicates the mean control
level. (From Hollis & St. Omer, 1972.)

More recently
Hollis and St. Omer (1972) studied the effects of 
chlorpromazine (0.5-3.0 mg/kg) on operant respond- 
ing by retardates in a controlled experimental situa- 
tion. Figure 12 shows the effect of chlorpromazine on 
the rate of FR 25 responding for M & M reinforce- 
ment by two retarded subjects. The response rates are 
expressed as deviations in standard scores from the 
baseline rates. The data demonstrate a direct relation 
between the dose of chlorpromazine administered and 
the amount of response rate suppression of fixed-ratio 
performance. In another manipulation, Hollis and St. 
Omer studied the effects of chlorpromazine on re- 
sponse rates and response latency using a FR 400 
schedule of positive reinforcement in two additional 
subjects. Figure 13 shows dose response curves of FR 
100 performance across the dosage range of .25 to 1.0 
mg/kg. As can be seen, once again there is an orderly 
relation between dose of chlorpromazine administered 
and suppression of operant responding, a finding com- 
parable to those obtained with infrahuman subjects. 
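
Expressing drug-session rates as deviations in standard scores from baseline, as described above, amounts to subtracting the mean control rate and dividing by the standard deviation of the control rates. The following is a minimal sketch under stated assumptions: the rates and doses are hypothetical, the function name is illustrative, and the sample standard deviation is assumed because the text does not say which form was used.

```python
from statistics import mean, stdev


def standard_scores(control_rates, drug_rates):
    """Express each drug-session rate in SD units from the control mean.

    Assumes the sample standard deviation of the control sessions.
    """
    control_mean = mean(control_rates)
    control_sd = stdev(control_rates)
    return [(rate - control_mean) / control_sd for rate in drug_rates]


# Hypothetical control-session rates (responses per minute).
control = [48.0, 52.0, 50.0, 49.0, 51.0]

# Hypothetical drug-session rates keyed by dose in mg/kg.
drug_by_dose = {0.5: 47.0, 1.0: 42.0, 3.0: 30.0}

doses = sorted(drug_by_dose)
scores = standard_scores(control, [drug_by_dose[d] for d in doses])

for dose, score in zip(doses, scores):
    print(f"{dose} mg/kg: {score:+.2f} SD units")
```

A score of zero corresponds to the mean control level, and increasingly negative scores with increasing dose correspond to the orderly suppression described in the text.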

Strong, Sulzbacher, and Kirkpatrick (1973) studied 
the effects of diphenhydramine on facial grimacing in 
a classroom by a five-year-old boy, deficient in lan- 
guage and learning ability. Figure 14 shows the rela- 
tionship between several manipulations of conse- 
quences following occurrence of grimacing in Experi- 
ment 1. In Experiment 2, data are presented from a 
baseline, placebo, and 12.5 and 25 mg doses of diphen-
hydramine. Both response-contingent consequences
(loss of candy and self-recording) suppressed grimac- 
ing to near zero rate per minute, whereas diphen- 
hydramine, which also had a rate reducing effect, did 
so to a considerably lesser degree. In another experi- 
ment, Sulzbacher (1972) studied the effects of d-am- 
phetamine and methylphenidate on classroom be- 
havior of children with learning difficulties. Figure 15 
shows the effects of placebo, 5, and 10 mg of d- 
amphetamine on talking out in class and out-of-seat 
behavior during class. Figure 16 shows the effect of 
d-amphetamine on academic performance. Ampheta-
mine had a marked rate-reducing effect on the inap-
propriate behavior, while producing a slight disrup-
tive effect on academic responding. Methylphenidate,
on the other hand, had a much less striking effect on 
talking out in class, a slight rate increasing effect on 
writing performance in the classroom, but little or no 
effect on arithmetic and reading behavior. 

Recently attention has been given to the interac- 
tion between reinforcement contingencies and drug 
treatment in purely applied settings, such as in state 
mental hospitals. Paul, Tobias, and Holly (1972) 
studied the effects of a variety of behaviorally active 
drugs on psychotic behavior of chronic mental hos- 
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the mean control level. (From Hollis & St. Omer, 1972.) 
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Fig. 14. Rate per minute of facial grimaces. Experiment 1 shows 
the effects of three different reinforcement contingencies. Ex- 
periment 2 compares the effects of two dosage levels of diphen- 
hydramine with baseline (no drug) and placebo conditions. 
These data on drug cflect were gathered about 2 hours after the 
medication was administered. (From Strong et al., 1974.) 


pital patients under conditions of milieu therapy and 
behavior modification in a state mental hospital. Al- 
though behavioral contingencies seem to have a sig- 
nificant effect in improving the behavior of psychotic 
patients, none of the drug therapies in any of the dos- 
ages administered had any statistically significant 
effect in improving behavior. McConahey and 
Thompson (1972) and McConahey (1973) studied the 
effect of chlorpromazine on the behavior of twenty- 
two moderately to severely retarded women in a state 
hospital. A multiple schedule was used in which token 
reinforcement was presented contingent upon adap- 
tive behavior during a portion of the day, while dur- 
ing a comparable period of the day no programmed 
consequences were arranged. During alternate 28-day 
periods the residents received, in a randomly assigned 
order, either chlorpromazine or an identically appear- 
ing placebo. No overall significant differences were 
found between drug and placebo treatments in any of 
25 behaviors recorded on all 22 patients over time. On 
the other hand, very large statistically significant 
differences were obtained on 22 of the 25 behavioral 

Fig. 15. Effects of d-amphetamine on Ralph's classroom behavior.
The heavy lines connect the means of each condition.
The light bars around each mean indicate the range of daily
rates. (From Sulzbacher, 1972.)

Fig. 16. Effects of d-amphetamine on the academic performance
of Ralph. (From Sulzbacher, 1972.)

measures across all residents, comparing the reinforcement
periods and the non-reinforcement periods of
the multiple schedule. Figure 17 shows the total fre- 
quency of observations during which residents were 
sitting at a table working constructively at a learning 
task during periods when they were being reinforced 
(contingencies) as opposed to periods when they were 
not being reinforced (no contingencies), while being 
treated with placebo or chlorpromazine. There were 
no apparent differences due to the chlorpromazine 
and placebo treatments in the tendency to work con- 
structively at a task, while there were very large differ- 
ences between periods in which patients were rein- 
forced with tokens as compared with those when they 
were not. Figure 18 shows similar data with the total 
frequency of residents raising their voices, a behavior 
frequently occurring preceding physical aggression. 
As can be seen there were no appreciable differences 
between the frequency of raising the voice during 
placebo and chlorpromazine treatments either with 

Fig. 17. Total frequency of instances of constructively working
at a task (seated) by 22 retarded women during daily 45-minute
periods under four treatment conditions: token reinforcement
(contingencies) plus placebo; token reinforcement plus chlorpromazine
(CPZ); no programmed contingencies plus placebo; no
programmed contingencies plus chlorpromazine. (From McConahey, 1972.)

Fig. 18. Total frequency of instances of raising voice (a behavior
frequently preceding physical aggressive acts) under
the same four treatment conditions as indicated in Figure 17.
(From McConahey, 1972.)


or without reinforcement. However, when reinforcement
occurred, there was an enormous reduction in the
amount of raising voices both during placebo and 
chlorpromazine treatments. 

Data from the foregoing studies suggest that it is 
possible to analyze some of the interactions between 
reinforcement conditions and drug-treatment condi- 
tions in a variety of applied settings involving dis- 
turbed children, children with learning difficulties, re- 
tardates and hospitalized psychiatric patients. The 
introduction of behaviorally active medication in a 
variety of applied settings must be predicated upon 
more careful attention to the interactions between 
drug treatment and the prevailing environmental 
contingencies. 
Central Reinforcement:
a bridge between brain function and behavior*

Gordon Mogenson and Jan Cioe

*The authors wish to express their appreciation to Miss
Anne Baxter and Mrs. Marianne Jeffery for typing the several
drafts of the manuscript and to D. Baran, J. P. Huston, W. J.
McClelland, P. M. Milner, A. G. Phillips, Ann Robertson, P.
Russell, B. B. Schiff, and T. B. Wishart who read and commented
on earlier versions of this article. The authors' research is supported
by grants from the National Research Council of Canada
and the Medical Research Council of Canada.


INTRODUCTION


It was nearly 60 years after the brain was first
stimulated electrically that it was shown that motivational
effects could be elicited by such stimulation.
This long delay following the historic experiments of
Fritsch and Hitzig, who stimulated the motor cortex
in 1870, was due in part to the facts that most of the
experiments during this period were in anesthetized
animals and that most investigators stimulated the
cerebral cortex, which is motivationally neutral (Doty,
1969). In the early 1930s Hess performed important
pioneering experiments in unanesthetized, freely moving
animals. Although he observed that a variety of
motivated behaviors could be elicited by stimulation
of the hypothalamus and other subcortical structures,
more than two decades passed before further developments
occurred. The demonstration of central reinforcement,
made possible by the use of stereotaxic
surgical procedures for the implantation of chronic 
stimulating electrodes and by the use of operant tech- 
niques, provided the major impetus for the study of 
brain mechanisms of reinforcement and more gen- 
erally for the study of brain-behavior relationships. 
Positive central reinforcement was first reported by 
J. Olds and Milner (1954), who observed that rats re- 
ceiving electrical stimulation of the septum returned 
to the place in an open field where they had been 
stimulated (Figure 1). When the animals could initiate 
the brain stimulation by pressing a lever they made 
this operant response at high rates for long periods of 
time. Central reinforcement, inferred from the brain 
self-stimulation phenomenon, immediately excited the 
interest and imagination of psychologists and other in- 
vestigators. During the next few years central rein- 
forcement was demonstrated with electrical stimula- 
tion of a number of subcortical structures in a variety 
of species, including man. Although the possible prac- 
tical application of central reinforcement was recog- 
nized early (in 1955 McClelland suggested, in a hu- 
morous yet prophetic vein, that brain self-stimulation 
might be made readily available to perk us up, as an 

Fig. 1. This picture illustrates the discovery of central reinforcement
by James Olds and Peter Milner. (From Hebb, 1972.)

“exotic coffee break”) and although the possibilities
for its use in the alleviation of abnormal behaviors 
clearly exist (Crichton, 1972), the major developments 
have been experimental and theoretical. Psychologists 
saw the relevance of central reinforcement to many of 
the issues and problems of the psychology of learning, 
and they were intrigued by the possibility that self- 
stimulation would permit the direct study of the 
neural substrates of reinforcement and lead eventually 
to an understanding of the basic mechanisms of moti- 
vation and learning. They were interested in compar- 
ing central reinforcement with conventional rein- 
forcers to determine whether the self-stimulation 
phenomenon was interpretable in terms of drive re- 
duction theory, the most popular behavioral theory at 
the time. The discovery of self-stimulation and central 
reinforcement reactivated a number of issues in the 
areas of motivation, reinforcement, and learning, and 
as we indicate later, the work that followed led to 
new conceptions of reinforcement. 

Central reinforcement has also stimulated other
areas of research. The self-stimulation procedure has 
been widely used in studying the effects of drugs on 
behavior, and research using central reinforcement has 
had a strong impact on the new disciplines of neuro- 
pharmacology and psychopharmacology. A number of 
studies have indicated that catecholaminergic neurons 
subserve brain self-stimulation. Self-stimulation is 
markedly reduced by drugs which inhibit the synthesis 
of catecholamines, deplete catecholamines from nerve 
terminals, or destroy catecholaminergic neurons, and 
is increased by drugs, such as amphetamine, which in- 
crease the synaptic release and prevent the reuptake 
of catecholamines (for a review see German & Bow- 
den, 1974). Several years ago Stein postulated the 
noradrenergic hypothesis of reward and subsequently 
demonstrated that noradrenalin was released during 
self-stimulation, apparently due to the activation of 
noradrenergic fibers (Stein & Wise, 1969). Stein, Wise,
and Berger (1972) have proposed that schizophrenia is
the result of a biochemical disturbance (a deficit in
the enzyme dopamine-β-hydroxylase) in which noradrenergic
“reward” fibers are destroyed by the production
of 6-hydroxydopamine, a neurotoxin.

The first successful conditioning of visceral responses
was obtained with central reinforcement and
opened the field of biofeedback. Central reinforcement
enabled experimenters to deliver a reinforcement
for the desired visceral response even though the
animal was paralyzed and so was unable to approach
or consume conventional reinforcers. It is necessary to
induce paralysis in order to avoid the possibility that 
the visceral response is merely an artifact of the con- 
ditioning of skeletal movements. Di Cara and Miller 
(1968) demonstrated that blood pressure, blood flow, 
intestinal contractions, and other visceral responses 
were conditioned using stimulation of reinforcing 
sites in the hypothalamus. This field, although contro- 
versial, is an active area of investigation which may 
contribute eventually to an understanding and treat- 
ment of psychosomatic disorders. 

Finally it should be mentioned that central rein- 
forcement has stimulated research at the interface be- 
tween psychology and ethology. This resulted from 
the observations that feeding (Hoebel & Teitelbaum, 
1962; Margules & Olds, 1962), drinking (Mogenson & 
Stevenson, 1966), object carrying (Phillips, Cox, 
Kakolewski, & Valenstein, 1969), and other species- 
typical behaviors are elicited from the same brain sites 
as self-stimulation. Not only has this led to an ingenious
theory of reinforcement (Glickman & Schiff, 1967)
and vigorous investigation of self-stimulation behavior,
but it has also helped to promote the discourse
between psychologists and ethologists, a development
which was long overdue.


METHODOLOGICAL CONSIDERATIONS 


Following the initial demonstration of central rein- 
forcement there was considerable research to delineate 
the regions of the brain subserving the phenomenon 
(e.g., J. Olds, 1956; M. E. Olds & J. Olds, 1963; Wet- 
zel, 1968). Although the hypothalamus has been the 
most popular target, central reinforcing effects are also 
obtained from many other neural structures, through- 
out the limbic system (Milner, 1970) and the extra- 
pyramidal motor system (e.g., Routtenberg & Mals- 
bury, 1969). Appendix A indicates the diversity of 
brain structures shown to be at least mildly reinforc- 
ing and also illustrates the ubiquity of the phenom- 
enon across species. 


Experimental Procedures 


In order to obtain central reinforcement it is neces- 
sary, of course, to place a stimulating electrode in or 
near one of the brain structures referred to in Ap- 
pendix A. The operative procedure involved is rela- 
tively simple and straightforward, especially if the 
subject is one of the more hardy laboratory animals 
(e.g., rat or gerbil). The animal is first anesthetized 
and the top of the head shaved. After the head is 
positioned in a stereotaxic instrument an incision is 
made along the midline to expose the skull. Bone 
sutures are identified as a guide in directing the elec- 
trode to subcortical structures. A standard brain atlas 
(e.g., Pellegrino & Cushman, 1967) is used with refer- 
ence to skull bone sutures in order to obtain three- 
dimensional coordinates for placement of the elec- 
trode in the desired structure. A hole is then drilled 
through the skull and the electrode is lowered to the 
desired depth by the micromanipulator of the stereotaxic
carriage. The electrode is securely attached by acrylic
cement to the skull and to anchoring screws placed around
the electrode hole. About one week is
usually allowed for the animal to recover from the 
trauma of surgery before testing is begun; such a 
preparation can continue in use for many months. 

Electrical stimulation provided by a stepdown 
transformer from a 60-Hz ac source is very effective 
for self-stimulation and was used extensively for 
several years. Commercial electronic stimulators have 
been used to deliver rectangular pulses which may 
be varied in pulse duration, frequency, and wave- 
form as well as in intensity. The duration of a train 
of reinforcing stimulation is usually limited to less
than 1 sec (.2 and .5 sec being the most popular), although
in certain situations an operant response may
be reinforced by several trains (e.g., Hawkins & Plis- 
koff, 1964). The details of the delivery system for the 
brain stimulation are dependent on the particular 
experimental situation, although typically the brain 
stimulation is delivered through wire leads from a 
commutator or other device which allows reasonable 
freedom of movement. When the desired response has 
occurred the central reinforcement is delivered either 
automatically or manually. 

Central reinforcement has been demonstrated in 
several experimental situations (mazes, obstruction 
boxes, runways, shuttleboxes), but it has been studied 
almost exclusively with operant methodology. With 
operant techniques a high degree of environmental 
control is possible, and since the operant response 
causes little or no change in the environment the 
behavior is very stable (Honig, 1966, p. 4). When the 
operant response is maintained with central reinforce- 
ment the motivational state of the animal which re- 
sults from the brain stimulation is relatively constant 
—unaltered by satiation effects, for example—so that 
the operant responding is stable for long periods of 
time. 


Operant Rate as a Measure of Central Reinforcement


The Skinner box has been the most popular test for 
self-stimulation, and as a result rate of response has 
been widely adopted as the measure of central rein- 
forcement. Response rate is not, however, always a re- 
liable indicator of reinforcement strength. Hodos and 
Valenstein (1962) showed in a two-lever preference 
test (hypothalamic versus septal stimulation at various 
current intensities) that animals did not necessarily 
select the site of stimulation and current intensity that 
maintained the highest response rates. For example, 
rats preferred septal stimulation at a moderate current 
intensity to hypothalamic stimulation even though 
the hypothalamic stimulation maintained a higher re- 
sponse rate. Similarly, Davis, Davison, and Webster 
(1972) demonstrated in pigeons performing on concur- 
rent variable interval (VI) schedules that the most 
highly preferred brain stimulation maintained low 
rates of responding. There is also evidence (Valenstein 
& Beer, 1962) that reinforcement strength as deter- 
mined by competition of central reinforcement with 
other reinforcers such as food and shock avoidance is 
not the same as that indicated by response rate. 
Finally, Hawkins and Pliskoff (1964) have made the
same point by using a two-member behavioral chaining
procedure. In this procedure responding on a VI
schedule on a single lever is reinforced by access to a
second retractable lever which delivers central reinforcement
on a continuous reinforcement (CRF)
schedule. The rate of pressing on the first lever continued
to increase at current intensities higher than
those which maintained peak response rate on the
second lever, thus indicating that the potency of central
reinforcement cannot be assessed adequately by
self-stimulation rates on a CRF schedule.

Valenstein (1964) has argued on the basis of these
results that the rate of responding on a CRF schedule
is a poor index of reinforcement strength. He has
gone on to point out that there are logical difficulties
in using an average CRF rate since there is the implicit
assumption that reinforcement strength is
homogeneous throughout the test session. Occasionally,
this assumption is not justified with central
reinforcement since some effects of the stimulation
which persist may change the strength of the reinforcer
after its administration (e.g., seizure activity:
Mogenson, 1965; Newman & Feldman, 1964). Furthermore,
central reinforcement, especially at higher current
intensities, may elicit various respondents in an
unconditioned manner (e.g., forced motor responses),
which can disrupt the response rate even though these
high current intensities are preferred over lower intensities
which maintain higher response rates (Hodos
& Valenstein, 1962).

Although the criticisms of Valenstein against a 
CRF rate of response as an index of central reinforce- 
ment appear justified in some situations, such a 
measure is meaningful in many others. Clearly, if a 
comparison is being made between diverse sites of 
stimulation, or if high intensities are employed, there 
is a real problem of distortion with a CRF response 
rate measure. It seems equally true, however, that in 
the majority of experiments there is little distortion 
as long as moderate current intensities are employed, 
as even Valenstein’s data indicate (Hodos & Valenstein, 
1962). The use of moderate intensities decreases the
probability of both interfering motor responses and
seizure activity. This situation is indeed fortunate
since most of the work comparing
central and conventional reinforcers (discussed in a
later section) is based on the CRF response rate
measure.


1. It might be useful at this point to establish two conventions:
(1) the species should be understood to be rats unless
otherwise indicated; and (2) the method used in a particular 
study involves a conventional operant response (such as lever 
pressing) on a CRF schedule unless a different method is 
specified. The use of these conventions will help to simplify the 
task of describing the many studies of self-stimulation. 

The Advantages of Intermittent Schedules


It should be noted that many of the criticisms of 
the use of response rate as a measure of central rein- 
forcement apply mainly when a CRF schedule is used. 
Intermittent schedules, which are preferable to the 
CRF schedule in most operant situations, produce a 
more stable pattern of behavior while still maintain- 
ing a sensitivity to environmental manipulations 
(Reynolds, 1968). Accordingly, intermittent schedules 
may be preferable for a behavioral analysis of self- 
stimulation. When the trains of pulses are spaced, the 
disruptive side effects of brain stimulation, such as 
forced motor responses and carry-over effects, are 
minimized. Using a two-lever chaining procedure,
Hawkins and Pliskoff (1964) clearly showed that the
response rate on the intermittently reinforced lever
paralleled the results obtained with preference tests
when current was manipulated. Moreover, by using
intermittent schedules one retains the advantages of
using a rate measure as the dependent variable (e.g.,
Honig, 1966, pp. 6-7; Skinner, 1966b, pp. 15-17).
There has been, however, some problem in maintaining
animals on a “reinforcement-poor” intermittent
schedule using central reinforcement in a single-lever 
situation; the problem (as is discussed more fully in a 
later section) appears at least partially to be one of 
determining how much central reinforcement is 
equivalent to the standard amounts of food and water 
reinforcement commonly used. The advantages of an 
intermittent schedule also accrue to the use of concur- 
rent schedules, which appear to be highly sensitive to 
changes in the magnitude of reinforcement, as op- 
posed to single schedules, which are less so (Hollard & 
Davison, 1971). Such sensitivity may be particularly 
useful in analyzing the incentive properties of brain 
stimulation. 

Another modification of intermittent schedules 
which has been successfully employed with central 
reinforcement is the progressive-ratio procedure de- 
scribed originally by Hodos (1961) for food reinforce- 
ment. In this procedure the animal is required to emit 
a progressively increasing number of responses in 
order to obtain each successive reinforcement. The
“terminal ratio” is defined as the highest number of 
responses emitted before a pause of given duration, 
such as 15 sec, occurs. This is used as the dependent 
measure. Hodos (1965) demonstrated that the size of 
the terminal ratio is sensitive to long durations of re- 
warding brain stimulation. Keesey and Goldstein 
(1968) further modified this procedure by defining the 
“terminal ratio” as the highest level of stable ratio 
(FR) responding that a given reinforcement condition 
would maintain, and they were able to demonstrate
that the functions relating this terminal ratio to a 
wide range of stimulus currents were monotonic at all 
brain stimulation sites tested. It appears that these 
more sophisticated behavioral techniques are poten- 
tially useful for the study of central reinforcement 
and for the analysis of the reinforcement process in 
general. 
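
The Hodos definition of the terminal ratio given above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `terminal_ratio`, the arithmetic progression of the ratio requirement (`step`), and the representation of a session as a list of response times in seconds are all hypothetical and are not taken from the studies cited. Here the terminal ratio is read as the largest ratio completed before the first pause that meets the criterion.

```python
from typing import Sequence


def terminal_ratio(response_times: Sequence[float],
                   step: int = 2,
                   pause_criterion: float = 15.0) -> int:
    """Largest ratio completed before the first pause >= pause_criterion.

    Assumption (illustrative only): the ratio requirement starts at `step`
    and grows by `step` after every reinforcement.
    """
    requirement = step   # responses needed for the next reinforcement
    count = 0            # responses made toward the current requirement
    completed = 0        # largest ratio completed so far
    previous = None      # time of the preceding response
    for t in response_times:
        # Stop at the first inter-response pause that meets the criterion.
        if previous is not None and t - previous >= pause_criterion:
            break
        count += 1
        if count == requirement:
            completed = requirement
            requirement += step
            count = 0
        previous = t
    return completed


# Twelve responses one second apart complete ratios of 2, 4, and 6;
# the 20-sec pause that follows ends the determination.
session = list(range(12)) + [31, 32, 33]
print(terminal_ratio(session))  # 6
```

The Keesey and Goldstein variant, which uses the highest stable FR value instead, would need a stability criterion that the text does not specify, so it is not sketched here.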

Although there are certain limitations in the use of 
a CRF schedule with central reinforcement, a CRF 
rate measure is still meaningful in most situations. 
When a comparison is being made between stimu- 
lation sites or when high current intensities are used 
there is the possibility of distortion of a CRF response 
rate measure. In most experiments there is little dis- 
tortion, however, especially if moderate current in- 
tensities are employed. 

Stein and Ray (1959) devised a CRF procedure 
which allows the animal to self-regulate the current 
intensity. Pressing one lever increases the current, 
whereas pressing a second lever decreases it. Inter- 
estingly, it was reported that high current levels which 
often produced disruptive motoric responses were 
selected. Modifications of this “titration” method have
been used extensively in the analysis of drug effects on 
self-stimulation behavior (e.g., Stein, 1962). 
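
As a rough illustration of the two-lever self-regulation idea described above, the following Python sketch tracks the current level across a sequence of lever presses. The function name `titrate`, the step size, the current limits, and the lever labels "up" and "down" are assumptions made for the example; they are not the parameters of the Stein and Ray apparatus.

```python
from typing import List, Sequence


def titrate(presses: Sequence[str],
            start: float = 10.0,
            step: float = 2.0,
            lowest: float = 0.0,
            highest: float = 50.0) -> List[float]:
    """Return the current level (arbitrary units) after each lever press.

    "up" raises the current by `step`; "down" lowers it by `step`.
    The level is clipped to the range [lowest, highest].
    """
    current = start
    history = []
    for lever in presses:
        if lever == "up":
            current = min(highest, current + step)
        elif lever == "down":
            current = max(lowest, current - step)
        else:
            raise ValueError(f"unknown lever: {lever!r}")
        history.append(current)
    return history


print(titrate(["up", "up", "down", "up"]))  # [12.0, 14.0, 12.0, 14.0]
```

The level around which such a record settles is the quantity of interest in this kind of procedure; a drug effect would then appear as a shift in that settled level.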


Nonrate Measures of Central Reinforcement 


As we have already mentioned, central reinforce- 
ment has been obtained in other experimental situa- 
tions which do not utilize a rate measure. These pro- 
cedures, however, also have drawbacks which have 
limited their usefulness. Hodos and Valenstein (1962) 
used a preference procedure in which animals choose 
between two conditions available on different levers, 
but this becomes extremely cumbersome when a large 
number of comparisons are to be made. This procedure,
nonetheless, is quite useful in validating other
procedures. The obstruction box technique employed 
by J. Olds (1960) is generally not as useful due to the 
inherent problems involved in repeated aversive foot 
shock as well as the apparent analgesic properties of 
some brain stimulation (Yunger, Harvey, & Lorens, 
1973). A technique which has gained much more 
widespread use than either of those already mentioned 
was developed by Valenstein and Meyers (1964) and 
employs a shuttlebox. They have suggested that a 
shuttlebox situation, in which central reinforcement 
is continuously delivered as long as the animal re- 
mains on one side but not on the other, provides a 
measure of reinforcement value which minimizes the
influence of activity level and performance capabilities.
The measure used is the percentage of time
spent receiving stimulation. Since the positive (brain
stimulation) and neutral sides can be randomly inter- 
changed, this measure is directly amenable to statis- 
tical evaluation. Some animals, however, self-admin- 
ister very brief trains of central reinforcement and 
may spend less than 50% of the session (chance level) 
on the positive side, although these same animals repeatedly
cross over and receive the stimulation (Cioe,
personal observation). Independent measures of reinforcement
strength indicate that stimulation will
maintain a lever-press response as well as the operant 
of crossing in the shuttlebox. This effect, however, is 
not reflected in the time score, which limits the useful- 
ness of this measure. 
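
The limitation of the time score can be made concrete with a small Python sketch. It assumes, purely for illustration, that a session is recorded as a list of (entry, exit) times in seconds for visits to the positive side; the function name `shuttlebox_scores` and the numbers in the example are hypothetical.

```python
from typing import List, Tuple


def shuttlebox_scores(visits: List[Tuple[float, float]],
                      session_length: float) -> Tuple[float, int]:
    """Return (percentage of session on the positive side, number of entries)."""
    time_on_positive = sum(exit_time - entry_time
                           for entry_time, exit_time in visits)
    percent_time = 100.0 * time_on_positive / session_length
    return percent_time, len(visits)


# An animal that enters the positive side every 10 sec but stays only 2 sec:
# thirty entries in a 300-sec session, yet only 20% of the time on that side.
visits = [(i * 10.0, i * 10.0 + 2.0) for i in range(30)]
print(shuttlebox_scores(visits, 300.0))  # (20.0, 30)
```

Reporting the number of entries alongside the time score, as in this sketch, is one way to show crossing that the percentage alone conceals.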


CENTRAL REINFORCEMENT COMPARED 
TO CONVENTIONAL REINFORCEMENT 


In the study of central reinforcement, operant 
methods are both important and useful, as shown in 
the previous section; indeed, the area would probably 
not have undergone such rapid and fruitful expansion 
without operant techniques. In this section we would 
like to turn the discussion around and demonstrate 
the contribution that central reinforcement can make 
to an analysis of operant behavior. The main issue is 
this: does self-stimulation differ in any substantial and 
irreducible way from the more conventional rein- 
forcers used before its discovery (e.g., food and water 
to a deprived animal, saccharine, etc.)? The initial
view of this phenomenon was that the neural sub- 
strate of reinforcement had been discovered, and so 
it was argued that conventional and central reinforce- 
ment were essentially identical in nature (J. Olds, 
1956). With further research, however, there emerged 
a growing number of apparent dissimilarities, and 
elaborate theories were developed to account for these 
differences. More recently, the similarities of conven- 
tional and central reinforcement have been stressed 
and the so-called differences attributed to procedural 
differences (e.g., Trowill, Panksepp, & Gandelman, 
1969) and to the more rapid decay of stimuli that con- 
trol central reinforcement (Lenzer, 1972). Conven- 
tional and central reinforcement will be compared in 
this section under the following headings: acquisition,
extinction, secondary reinforcement, priming, intensity
and persistence, partial reinforcement, and influence
of drive state on self-stimulation. In each case,
we shall attempt to show that when appropriate comparisons
are made and equivalent conditions established,
the two kinds of reinforcers do not differ substantially.


Acquisition 


In the pioneering study of J. Olds and Milner 
(1954) it appeared that the rate of acquisition of the 
operant response for central reinforcement was un- 
usually rapid compared to acquisition using conven- 
tional reinforcers. Subsequently this observation was 
confirmed by other investigators experienced with 
shaping animals using conventional reinforcers. A 
comparison of central and conventional reinforcers is 
not entirely justified, however, since the temporo- 
spatial relations between response and reinforcement 
are not the same (Gibson, Reid, Sakai, & Porter, 1965). 
For example, when water is used as the reinforcer of 
an operant response the animal must depress the 
lever, move to the dipper, and then drink: there is 
usually a spatial separation between the lever and the 
water which results in a delay of the consummatory 
act. With central reinforcement, however, there is no 
such delay in reinforcement. Gibson et al. attempted 
to equate the two situations by having animals that 
were receiving central reinforcement depress a lever
which introduced a dry dipper that had to be contacted
to trigger the stimulation, a situation comparable
to the behavioral chaining procedure of Hawkins
and Pliskoff (1964); either sugar water or brain stimulation
was delivered immediately when the animal
touched the dipper. Panksepp and Trowill (1967a)
made the two situations even more similar by deliver- 
ing chocolate milk via an intraoral fistula immediately 
after a lever press. It was found in both studies that 
when the conventional reinforcement was immediate, 
acquisition was as rapid as that found with central 
reinforcement. This finding is not surprising given
the well-established importance of delay of reinforce- 
ment in other situations (Renner, 1964). 

It is not clear what role temporospatial factors play 
in other test situations. For example, for the acquisi- 
tion of a discrimination task there are conflicting 
reports as to whether or not there are significant differ- 
ences in acquisition rate between central and conven- 
tional reinforcers. Kling and his associates (Kling & 
Berkley, 1968; Kling & Matsumiya, 1962; Terman & 
Kling, 1968) have failed to find significant differences. 
Sadowsky (1969), however, has demonstrated a faster 
rate of acquisition of a multiple schedule (i.e., an 
operant discrimination) with central reinforcement as 
compared to food pellets. None of these studies attempted
to control for temporospatial differences between
central and conventional reinforcers. Linholm
and Keesey (1970), however, tried to control for these
differences by imposing an arbitrary 1-sec delay of central
reinforcement following performance of the operant
(breaking a photobeam over the food cup). Despite
such a control the acquisition rate for central
reinforcement (both hypothalamic and septal stimulation)
was superior as compared with sweetened condensed
milk. The study therefore suggests a difference
between central and conventional reinforcers—at least 
for a discrimination task—not attributable to differ- 
ences in immediacy of delivery. Linholm and Keesey 
suggest, rather speculatively, that this effect may be 
related to differences in the duration of central and 
conventional reinforcers. Stimuli associated with in- 
gestion occur following the presentation of food 
reinforcement, and as the food passes through the oral-esophageal
cavities the possibility of conditioning behaviors
incompatible with the discriminative response
increases, since these stimuli are present for a rela- 
tively long period of time. Central reinforcement, in 
contrast, usually involves a relatively short duration 
of stimulation with more distinctive onset-offset char- 
acteristics. Such an interpretation is not inconsistent 
with a view that stresses the similarities of central and 
conventional reinforcers; central reinforcement is a
case in which the experimenter has greater control 
than usual over the duration of the stimuli associated 
with reinforcement, and appropriate manipulations 
would result in greater similarity in the effects of central
and food reinforcement (e.g., Linholm & Keesey,
1968).


Extinction 


Extinction occurs very rapidly when the operant 
response is reinforced by brain stimulation (Culber- 
ton, Kling, & Berkley, 1966; Deutsch & Howarth, 1963; 
J. Olds & Milner, 1954; Seward, Uyeda, & Olds, 1959), 
suggesting a second fundamental difference between 
central and conventional reinforcers (Figure 2). How- 
ever, the earlier differences may have been due to 
differences in procedure; with conventional reinforcers
the animal is usually food- or water-deprived, and 
there is a temporal delay between operant response 
and reinforcement; whereas with central reinforce- 
ment the animal is typically not deprived, and rein- 
forcement is presented immediately. 

Extinction occurs less rapidly with central rein- 
forcement when animals are food-deprived (Deutsch 
& DiCara, 1967). Furthermore, extinction with con- 
ventional reinforcement occurs more rapidly when 
animals are not deprived (Panksepp & Trowill,
1967b). When central reinforcement and sugar water
were used with comparable testing conditions, extinction
curves were similar (Gibson et al., 1965). It may
be concluded that extinction with central reinforcement
is comparable to extinction with a high-incentive
conventional reinforcer presented with minimal
delay to an animal sated or following a short deprivation
period (Trowill, Panksepp, & Gandelman, 1969).

Fig. 2. When an operant response is maintained by central
reinforcement, extinction occurs very rapidly as shown when
the stimulation voltage is reduced to zero. (From Olds & Milner,
1954. © 1954 by the American Psychological Association. Reprinted
by permission.)


Secondary Reinforcement 


The deprivation state of the animal may also be 
the variable which accounts for the mixed success of 
attempts to demonstrate secondary reinforcement us- 
ing brain stimulation. Seward, Uyeda, and Olds 
(1959), Keys (1964), and Mogenson (1965) were all un- 
able to establish secondary reinforcement, but these 
studies were conducted with nondeprived animals. 
Stein (1958), using a classical conditioning procedure, 
paired a tone with hypothalamic or septal stimulation 
and demonstrated that the tone possessed reinforcing 
properties due to its association with central stimula- 
tion. Similarly, Knott and Clayton (1966) were able to 
confirm Stein’s results in addition to demonstrating 
that partial reinforcement (i.e., intermittent pairing 
of tone and central reinforcement) produces a more 
durable secondary reinforcement than does continuous
reinforcement. Unfortunately, however, neither
of these reports specifies the feeding schedule em- 
ployed. The importance of this variable is indicated 
by DiCara (1966) as well as DiCara and Deutsch 
(1966). It was found that secondary reinforcement 
could be obtained using electrical stimulation of a 
brain area sensitive to the level of food deprivation 
(see J. Olds, 1958a) if the animals were in fact food- 
deprived; if food deprivation did not affect the rate of 
self-stimulation, no secondary reinforcement was ob- 
tained. It appears, therefore, that another apparent 
difference between central and conventional rein- 
forcers is related to the level of deprivation rather 
than to the characteristics of the reinforcers. 


Priming 


J. Olds and Milner (1954) reported that following 
the rapid extinction to central reinforcement it was 
necessary to deliver noncontingent brain stimulation 
“to show that the current was turned on again” (p. 
425) before lever pressing resumed. Furthermore, it 
has been reported that it is also necessary to deliver a 
few of these “free” stimulations at the start of a session 
to induce the animal to self-stimulate even though no 
formal extinction procedure has been introduced. 
This anomaly of central reinforcement has been em- 
phasized by Deutsch and Howarth (1963); they have 
suggested that self-stimulation involves the activation 
of a drive system as well as a reinforcement system. 
Priming (i.e., delivery of “free” stimulation), it is sug-
gested, induces the proper drive state so as to motivate 
the animal to lever-press; without the induction of 
this drive state the animal fails to start self-stimulat- 
ing. Each stimulation, therefore, sets up the appro- 
priate state in the animal so that the following 
stimulations are reinforcing and thereby responding is 
maintained. 

Although there is a certain appeal to this view, 
which has led to some interesting experiments by 
Deutsch and his colleagues, it has been reported that 
many animals do not require priming to initiate self- 
stimulation (Trowill, Panksepp, & Gandelman, 1969). 
Furthermore, animals can be trained to continue 
lever-pressing after periods of nonreinforcement (Gan- 
delman, Panksepp, & Trowill, 1968; Pliskoff, Wright,
& Hawkins, 1965), indicating that central reinforce- 
ment “is not totally dependent on the time since the 
last stimulation” (Trowill et al., 1969, p. 291). 


Intensity and Persistence 


One of the more obvious features of brain self-stim- 
ulation is the astonishing vigor of the animal’s behavior,
which is frequently reflected in very high rates
of responding. For example, Ray, Hine, and Bivens 
(1968) reported an average rate of 130 lever presses per 
min on a CRF schedule. It should be noted, of course, 
that part of the reason for such high rates with central 
reinforcement is that so little time is needed to “con- 
sume” the reinforcement—certainly much less time 
than is required to chew pellets or drink water. This
leads to an artifact in the comparison of response rates 
based on a CRF schedule and in fact is one reason 
why it is more desirable to use an intermittent sched- 
ule for such comparisons. The differences in the time 
required to “consume” the reinforcement will only
minimally affect such schedules. 

Not only do animals emit relatively high response 
rates, but they continue to do so for relatively long 
periods of time. J. Olds (1958a) recorded 35 lever 
presses per min for 26 hr until the animal was ex- 
hausted and went to sleep (Figure 3), and Valenstein 
and Beer (1964) an average rate of 30 responses per 
min for a period of 20 days. Even telencephalic stimulation,
which typically satiates more quickly than
diencephalic stimulation, does so only after 4 to 8 hr
(J. Olds, 1958b). It is this characteristic—relative insatiability—of
central reinforcement which is its most
salient feature as compared to conventional reinforcers
and which makes central reinforcement such a
useful laboratory procedure.

Fig. 3. Operant responding continues for very long periods of
time when central reinforcement is used. This rat pressed the
lever for more than 24 hr before going to sleep. (From Olds,
1958b. © 1958 by the American Psychological Association. Reprinted
by permission.)

Fig. 4. Water-deprived rats press a lever for electrical stimulation
of the hypothalamus in preference to lever-pressing for water,
as shown for current levels III, IV, and V. When the intensity
of brain stimulation is low they show a preference for water.
(Based on Morgan & Mogenson, 1966.)

The potency of central reinforcement and the degree
to which it may be used to control an animal’s
behavior are most clearly demonstrated in test situations
involving a choice between conventional reinforcers
and reinforcing brain stimulation. Rats show a
preference for the central reinforcement when it is in
competition with food (Routtenberg & Lindy, 1965;
Spies, 1965) or water (Falk, 1961; Mogenson, 1969b;
Morgan & Mogenson, 1966; Phillips, Morgan, &
Mogenson, 1970; Stutz, Rossi, & Bowring, 1971), even
when food- or water-deprived for 24 or 48 hr (Figure
4). Rats ignored food and water (“self-starved”) but
lever-pressed for electrical stimulation of the medial 
forebrain bundle in the region of the hypothalamus 
(Routtenberg & Lindy, 1965). Animals also tolerate in- 
tense painful stimuli (Olds, 1960; Valenstein & Beer, 
1962) or a cold ambient temperature (Carlisle & 
Snyder, 1970) in order to self-stimulate the hypo- 
thalamus. 

Central reinforcement, although usually stronger 
than conventional reinforcement in competition tests, 
does not produce a rigid, inflexible pattern of re- 
sponding. If the incentive characteristics of the alter- 
native are high, the preference for central reinforcement
disappears. When the alternative was a highly
palatable saccharine-glucose solution, lever-press rates 
of more than 100 per min were recorded and rats 
showed an equal preference for this solution and elec- 
trical stimulation of the hypothalamus after being 
water- and food-deprived for 22 hr (Phillips, Morgan, 
& Mogenson, 1970). The relative preference for central 
and conventional reinforcement depends on the cur- 
rent intensity (Deutsch, Adams, & Metzner, 1964; Falk, 
1961; Morgan & Mogenson, 1966), the length of water 
deprivation (Deutsch, Adams, & Metzner, 1964; Mor- 
gan & Mogenson, 1966), the duration of the test ses- 
sion, and the palatability of the conventional rein- 
forcer (Phillips, Morgan, & Mogenson, 1970). 

The results of the studies that have just been re- 
viewed suggest that when deprivation period, inten- 
sity, and quality of the reinforcer are manipulated ap- 
propriately, central and conventional reinforcers are 
equally effective in controlling operant responses. 


Partial Reinforcement 


As mentioned earlier in the discussion of method- 
ological problems, partial-reinforcement schedules 
may circumvent some of the difficulties involved in 
using response rate on a CRF schedule as a measure 
of relative reinforcement value. There are, however, 
repeated references in the literature suggesting that 
performance for central reinforcement is poorer than 
for conventional reinforcers on more complicated 
schedules. Sidman, Brady, Boren, Conrad, and Schul- 
man (1955) reported obtaining successful performance 
on a variable-interval (VI) 16-sec schedule and a fixed- 
interval (FI) 7-sec schedule. Brodie, Moreno, Malis, 
and Brodie (1960) found that most of their monkeys 
would not respond to schedules exceeding fixed ratio 
(FR) 20, although one anomalous monkey performed 
an FR 150; Culberton, Kling, and Berkley (1966) re- 
ported that it took four times as long to train animals 
to respond on an FR 10 using central reinforcement 
than with water. 

Other investigators (e.g., Brown & Trowill, 1970;
Cantor, 1971; Pliskoff, Wright, & Hawkins, 1965) have 
managed to obtain performance for central reinforce- 
ment similar to that for conventional reinforcers by 
slight manipulations of the reinforcement procedure. 
Sidman et al. (1955) were the first to point out that 
performance on these schedules was related to the in- 
tensity of the brain stimulation; they further sug- 
gested that low-current stimulation was comparable to 
small amounts of reinforcement and that to obtain 
stable performance the “amount” of central reinforce- 
ment must be properly equated to conventional reinforcers.
This line of argument was extended by Plis-
koff and colleagues (Hawkins & Pliskoff, 1964; Pliskoff, 
Wright, & Hawkins, 1965; Pliskoff & Hawkins, 1967), 
who suggested that a standard food pellet was equiv- 
alent to several (5-20) brief trains of brain stimulation 
in terms of reinforcement strength. Using such a mul- 
tiple-stimulation technique it has been possible to 
maintain FI 1-min and VI 1-min schedules (Brown &
Trowill, 1970), and when combined with the two- 
member chaining procedure that equates temporo- 
spatial factors, performance was maintained on FI 10 
min, FR 200, and differential reinforcement at low 
rates (DRL) 180-sec. 

Cantor (1971) has also been able to obtain stable 
behavior with intermittent schedules (e.g., FR 200, 
VR 30, FI 3-min, VI 2-min, and DRL 20-sec), but us- 
ing only single stimulations without an equating of 
temporospatial factors (Figure 5). The distinguishing
feature of his procedure was that the single stimulation
was made predictable by preceding it with a
brief, exteroceptive warning signal; when the animal
depressed the bar a light came on and continued for
1 sec with the central reinforcement automatically
occurring during the last .5 sec of the light. If this
signal was withdrawn after mastery of the schedule,
performance broke down. The warning signal can also be
viewed as enhancing the reinforcement value of central
reinforcement since it has been found that predictable
central reinforcement (i.e., signaled) is more
reinforcing than unpredictable central reinforcement
(i.e., unsignaled) (Cantor & LoLordo, 1970, 1972); the
precise reason for this increase in reinforcement value
is not certain.

Fig. 5. Sample cumulative response curves for two rats. Rat
294 received a multiple schedule (FR 100, DRL 20-sec) and an
FR 200; rat 297 is on a VI 2-min schedule. Reinforcement is
indicated by oblique “pips.” Brain stimulation for rat 297 was
signaled on the record at A, unsignaled at B, and again signaled
at C. (From Cantor, 1971. © 1971 by the American Association
for the Advancement of Science.)

It seems, once again, that what was initially con- 
sidered a dramatic difference between central and 
more conventional reinforcers (i.e., maintenance of
high intermittent schedules) can be viewed as an arti- 
fact of dissimilar procedures. 


Influence of Drive State on Self-Stimulation


The effectiveness of food and water and other con- 
ventional reinforcers is enhanced when the animal is 
deprived to induce a central motive state. Is central 
reinforcement similarly enhanced by a central motive 
state produced either by deprivation of the primary 
reinforcer (e.g., food) or by electrical stimulation of 
certain areas of the brain—primarily the
hypothalamus?

Early studies of the effects of food and/or water 
deprivation on self-stimulation rates (e.g., Brady, 
Boren, Conrad, & Sidman, 1957) were not conclusive, 
since the enhancement of rate obtained could have 
resulted from a general effect of deprivation on activ- 
ity rather than a specific interaction between self- 
stimulation and the deprivation. Olds (1958a), how- 
ever, was able to demonstrate with castrated male rats 
that some animals (medial electrode placements) 
showed increased self-stimulation rates when food- 
deprived whereas other animals (more lateral place- 
ments) showed an increased rate only when injected 
with androgen (male sex hormone). Not only is this a 
clear demonstration that general activity changes were 
not responsible for the enhanced response rates, but it 
also suggests that the effect of a particular deprivation 
or need state was specific to the site of stimulation. 

In subsequent studies it was observed that feeding 
(Hoebel & Teitelbaum, 1962; Margules & Olds, 1962), 
drinking (Mogenson & Stevenson, 1966), copulation 
(Caggiula & Hoebel, 1966; Herberg, 1963), and other
motivated behaviors could be elicited by electrical
stimulation of brain sites effective for central reinforcement
(Figure 6). The rate of self-stimulation of
“feeding sites” was increased during food deprivation
(Goldstein, Hall, & Templer, 1970; Margules & Olds,
1962) and following the injection of insulin, which increases
appetite (Hoebel, 1969). On the other hand,
stomach distention, forced obesity, and injections of
glucagon, procedures which reduce food intake, reduced
the rate of self-stimulation of these brain sites.²
Similarly, castration reduced, and the administration
of androgens increased, the rate of self-stimulation of
hypothalamic sites from which copulatory behavior
could be elicited (Caggiula & Hoebel, 1966; Hoebel,
1969).

Fig. 6. Concurrent self-stimulation and elicited drinking for a
period of 24 hr. The cumulative lever-press responses are shown
in the top panel. In the lower panel cumulative water intake (top
line) and cumulative urine output (bottom line) are plotted.
(From Mogenson & Stevenson, 1967.)

These studies suggest that central drive or motive 
states influence conventional and central reinforce- 
ment in a similar manner. According to Deutsch 
(1960) the reinforcement system functions in associ- 
ation with a drive system: during self-stimulation,
both the reinforcement system and the drive system
are activated concurrently. Hoebel (1969, 1971) has
explained the results described in the previous paragraph
by assuming several reinforcement systems—a
hunger-reinforcement system, a thirst-reinforcement
system, a sex-reinforcement system, etc., each in association
with a particular drive.


2 Water deprivation was reported not to change the rate of
self-stimulation of hypothalamic ‘drinking sites’ (Mogenson,
1969a). It is not clear why thirst fails to influence self-stimulation
whereas hunger and sex drives enhance central reinforcement.


SOME IMPLICATIONS OF COMPARING 
CENTRAL AND CONVENTIONAL 
REINFORCERS 


From the comparisons that have been made thus 
far, it appears that central and conventional rein- 
forcers are very similar if allowance is made for 
methodological and parametric differences in the ex- 
periments on which the comparisons are based. This 
is not to suggest that central and conventional rein- 
forcers are identical. In fact, there is increasing recog- 
nition that conventional, peripheral reinforcers them- 
selves are not identical or even homogeneous (Bolles, 
1970; Shettleworth, 1972; Staddon & Simmelhag, 
1971). How an animal’s behavior is altered by a rein- 
forcer, and what and how it learns, is subject to the 
constraints of its species-specific behavioral organiza- 
tion; as Skinner (1966a) pointed out several years ago, 
behavior is a function of phylogenic contingencies as 
well as ontogenic contingencies. 

Central reinforcement may also not be homogen- 
eous. It has been suggested that there are subsystems 
of reinforcement (Gallistel & Beagley, 1971), and it has 
also been suggested that they may be related to feed- 
ing, drinking, or other drives (Hoebel, 1969) or to 
various species-typical behaviors (Glickman & Schiff, 
1967). Even if there is a single central reinforcement 
system (as assumed by Stein, 1969), it may be activated 
less directly or less strongly depending on the sites of 
stimulation. The electrical stimulation may activate 
other neural systems besides central reinforcement, 
and these “side effects” may alter its effectiveness in
reinforcing operant responses.³

In any case, in view of the similarities of central
and conventional reinforcers it may be asked whether
there are any advantages in using central reinforcement
which permit a unique contribution to the experimental
analysis of behavior or to the better understanding
of the mechanisms of reinforcement.


3 Reinforcing electrical stimulation may elicit various sorts of
respondents such as integrated consummatory responses, forced
motor movements, autonomic changes (e.g., heart rate and blood
pressure), as well as endocrine changes (elevated ACTH), some
of which may be entirely independent of the reinforcing
character of the brain stimulation. Such spurious effects are believed
to result from the simultaneous activation of a number
of closely intermingled neural systems by the relatively indiscrete
electrical stimulation.

In terms of rate of operant responding and prefer- 
ence behavior, direct stimulation of appropriate brain 
sites is the most potent reinforcer available. It can be 
used without the undesirable contaminating effects of 
conventional reinforcers such as the stress of food 
deprivation or aversive peripheral shock and the sati- 
ating effects of the consummatory behavior. Self- 
stimulation may be considered a “pure operant”; it 
can be maintained for relatively long periods of time 
without excessive ingestion of food or water or other 
consequences which inhibit the operant behavior. It 
is the purest form of incentive whose effects have a 
rapid onset and rapid offset. Therefore, with the use 
of central reinforcement it is possible to have the 
greatest degree of behavioral control and to maintain 
this control for long periods of time. This characteris- 
tic of central reinforcement has made it a valuable 
tool in psychopharmacology and has made possible 
the pioneering studies of biofeedback (see this chap- 
ter’s introduction). 

In the hands of experts trained in operant tech- 
nology, the purity, potency, and insatiability of cen- 
tral reinforcement permit a control of human be- 
havior which is awesome to contemplate. The decision 
about the legitimate use of central reinforcement for 
behavior modification in the treatment of self-destruc- 
tive tendencies and other abnormal behaviors or in 
the training of mentally retarded children will not be 
an easy one; Michael Crichton’s recent novel Termi- 
nal Man (1972) highlights the ethical issues associated 
with the use of central reinforcement for such pur- 
poses. However, there is no question that the neuro- 
behavioral technology is available for an impressive 
degree of control of man’s behavior. 

The other unique feature and advantage of central 
reinforcement is that it permits direct access to the 
mechanisms of the brain that subserve reinforcement. 
Although this may not appear important to those in- 
terested in reinforcement exclusively from the view- 
point of behavioral control, it does excite physiologi- 
cal psychologists and other neuroscientists concerned 
with the neural substrates of reinforcement, motiva- 
tion, and learning. Self-stimulation experiments have 
implicated a number of brain structures which sub- 
serve reinforcement. Sophisticated experiments which 
combine electrophysiological, histological, and neuro- 
chemical techniques with operant techniques are be- 
ginning to elucidate the characteristics of the positive 
reinforcement system and its relationship with mem- 
ory, motor, and perceptual systems (Gallistel, Rolls, & 
Greene, 1969; Rolls, 1972; Smith & Coons, 1970). This
active and interesting field of research promises to
reveal the circuitry and the workings of the neural 
pathways that subserve both central and conventional 
reinforcers. 

At the present time the exact neural basis for cen- 
tral reinforcement is still uncertain in spite of vig- 
orous investigation for nearly two decades. This state 
of affairs has prompted Crow (1972c) to comment, 
“As a bridge between psychological theory and neuro- 
physiology the value of the discovery [of central rein- 
forcement] is limited by the fact that the anatomical 
pathways which must be activated to yield the response 
have not been identified” (p. 414). However, new de-
velopments in mapping neural pathways using histo- 
chemical techniques (to provide a chemical neuro- 
anatomy of the hypothalamus and limbic system) may 
lead to the identification of the neural substrates of 
central reinforcement (discussed below). 


THE NATURE OF CENTRAL 
REINFORCEMENT 


An observer who sees an animal self-stimulate its 
brain for the first time is usually very impressed with 
this fascinating phenomenon. Typically he asks, “Why
does the animal do that?” What is the nature and the
mechanism of central reinforcement? 

A number of explanations of self-stimulation have
been proposed during the last 20 years. However, in 
spite of intensive and often ingenious research it is not 
possible to say with certainty whether any of these 
interpretations is correct. In this final section we first
consider two of the most popular views and then 
present some speculations about the mechanisms of 
central reinforcement. 


Central Reinforcement from the Viewpoint of Drive
Theory and Response Reinforcement


When self-stimulation of the brain was discovered, 
and for several years thereafter, drive-reduction theory 
was the dominant theory of reinforcement and learn- 
ing. It is not surprising that attempts were made to 
explain the self-stimulation phenomenon from this 
theoretical point of view (Miller, 1960), and studies of 
central reinforcement were undertaken with this the- 
ory in mind (Deutsch & Howarth, 1963).

In one of his earlier papers J. Olds (1956) proposed 
that positively reinforcing brain stimulation “must 
excite some of the nerve cells that would be excited 
by satisfaction of the basic drives—hunger, sex, thirst, 
and so forth” (p. 15). In accordance with drive-reduction
theory, Olds was suggesting that central rein-
forcement is due to the brain stimulation mimicking 
(or activating neural systems concerned with) the re- 
duction of one of the basic biological drives, which in 
turn reinforces the operant response. However, a few 
years later observations were made which suggested 
just the opposite: feeding (Hoebel & Teitelbaum, 
1962; Margules & Olds, 1962), drinking (Mogenson & 
Stevenson, 1966), and copulatory (Caggiula & Hoebel, 
1966; Herberg, 1963) behavior were elicited from the 
same electrode sites that were highly effective for self- 
stimulation. 

Initially it seemed paradoxical that the animal 
pressed a lever to stimulate its hypothalamus making 
it seek food or water. Deutsch (1960) proposed that 
there is a distinctive reinforcement system of the brain 
activated during self-stimulation and that it functions 
in association with a drive system. He did not see any 
paradox, since he assumed that the reinforcement 
system and the drive system had to be activated concurrently.
In other words, drive induction, produced
either in a conventional manner (e.g., water deprivation)
or by electrical stimulation of the brain, is a
necessary condition for reinforcement. Glickman and
Schiff’s (1967) attempt to resolve this paradox involved
a departure from drive-reduction theory and
a different view of both the nature of reinforcement
and the mechanisms by which feeding and drinking
are elicited by electrical stimulation of the brain.
They pointed out that the behaviors elicited from the
same brain sites as self-stimulation (feeding, drinking,
gnawing, copulating, etc.) are species-typical responses
that contribute to adaptation to the environment and
to survival. In the course of biological evolution animals
endowed with the neurological apparatus to
make these responses will survive and reproduce. According
to Glickman and Schiff, reinforcement results
from the activation, either by natural stimuli or by
electrical stimulation of the brain, of neural pathways
that initiate species-typical behaviors. They assumed 
that there is a single system which has conventional 
reinforcement and drive properties. In some ways, 
their proposal is a new version of Sheffield’s (Sheffield, 
Roby, & Campbell, 1954) consummatory response 
theory, and they avoid the paradox referred to above 
by avoiding the use of terms like hunger drive or 
drive reduction. Unlike Deutsch, they do not assume a 
distinctive drive system. 

Although self-stimulation and elicited drinking have 
been observed to occur concurrently (Mogenson, 
1969a), the elicited drinking is not necessary for cen- 
tral reinforcement. The administration of ampheta- 
mine, which markedly reduced drinking elicited by
hypothalamic stimulation, did not reduce, but rather
increased dramatically, the rate of self-stimulation of 
the same hypothalamic site (Mogenson, 1968). Fur- 
thermore, by varying the parameters of stimulation, 
the “drive-eliciting” and central reinforcement effects
of hypothalamic stimulation could be dissociated.
Using a train of hypothalamic stimulation .5 sec or
longer both self-stimulation and elicited drinking
were observed, whereas with a train duration of .1-.2
sec self-stimulation occurred in the absence of elicited
drinking (Figure 7). Lesions have also been used to 
dissociate the drive-inducing and the central reinforce- 
ment properties of hypothalamic stimulation; it has 
been shown that lesions of the substantia nigra selec- 
tively disrupt the elicited carrying of objects that oc- 
curred concomitantly with self-stimulation in a shut- 
tlebox but did not interfere with self-stimulation per 
se (Phillips, 1973). Finally, it should be noted that the 
current intensity threshold to elicit feeding and drink- 
ing is below the threshold for self-stimulation when 
stimulus train durations of more than 1 sec are used
(Coons & Cruce, 1968; Huston, 1971, 1972; Miller, 
1960). 


Central Reinforcement from the Viewpoint 
of Incentive Motivation 


Electrical stimulation of the hypothalamus may 
elicit drinking and feeding not by activating systems 
for internal deficit signals, as suggested in the previ- 
ous section, but by mimicking incentive stimuli (or 
“appetite-whetting” stimuli) associated with drinking 
and feeding; it is then not a paradox that animals 
self-stimulate “feeding sites” and “drinking sites.” Cer- 
tain stimuli, such as sweet and salty solutions, are 
particularly effective reinforcers (Pfaffman, 1960; 
Young, 1959), even in the absence of biological deficits 
and needs; a sensory stimulus “can function as a rein- 
forcer in its own right” (Pfaffman, 1960, p. 255). Pfaff- 
man has suggested that central reinforcement is due 
to the activation of neural pathways that normally 
transmit such incentive motivational stimuli. 

The results of experiments by Pfaffman and Young, 
as well as those from peripheral self-stimulation experi- 
ments (Campbell, 1971), suggest that brain self-stimula- 
tion results from activating pathways of the brain that 
transmit signals from natural reinforcers. These in- 
clude inputs from smell and taste and other exterocep- 
tive stimuli, but also may include proprioceptive and 
interoceptive inputs. In higher animals “central reward 
pathways” are also activated by cognitive processes. 
Campbell (1971), after noting the pleasure one gets 
from mathematics, science, chess, or crossword puzzles,
maintains that “only in the human brain can
thinking activate the limbic pleasure areas” (p. 22).
It appears that in man and other animals with complex
brains, reinforcement and motivated behaviors
depend on natural processes concerned with higher
cognitive functions as well as those that process interoceptive,
exteroceptive, and proprioceptive inputs.

Fig. 7. The effect of varying the duration of hypothalamic stimulation
on self-stimulation and elicited drinking. Elicited drinking
occurs only when the duration is greater than .4-.5 sec. The
fastest rates of self-stimulation occurred with the shortest duration
of stimulation. (From Mogenson & Stevenson, 1966.)


Central Reinforcement as the Interaction of Incentive 
Stimuli and a Central Motive State 


It is now generally acknowledged that there was 
too much emphasis in the past on the role of internal, 
deficit signals for the initiation of motivated behaviors 
and the so-called drives. A number of investigators 
have emphasized the importance of identifying and 
studying the influence of the various controlling stim- 
uli for self-stimulation (Lenzer, 1972) and for other 
motivated behaviors (Flynn, Vanegas, Foote, & Ed- 
wards, 1970; Roberts, 1970). There is an increasing 
acceptance of the view that motivated behavior in 
general and reinforcement in particular depend on 
external as well as internal stimuli (Bindra, 1969; 
Mogenson & Huang, 1973). One of the best examples 
of the importance of this interaction of internal and 
external stimuli is the mislabeling of drive states that 
are initiated by internal stimuli in the absence of 
appropriate environmental signals as reported by 
Schachter (1967). It appears that the central motive 
state that subserves a particular motivated behavior 
depends on both internal, deficit signals and external 
signals. 

Support for this view comes from studies of the 
effects of sensory stimuli on brain self-stimulation. 
Rats self-stimulated faster when induced by the brain 
stimulation to drink water, apparently because of the 
oral stimulation (Figure 8; see also Mogenson & Kap- 
linsky, 1970), and when the sensory stimulation was 
further increased by adding saccharine to the water 
they self-stimulated at a still faster rate (Phillips & 
Mogenson, 1968): external incentive stimuli enhanced 
central reinforcement. Even when the external stimu- 
lus is not response-contingent, self-stimulation is in- 
creased, as demonstrated when rats self-stimulated 
with a background odor present (Phillips, 1970). It 
appears, therefore, that the external stimulus is not 
enhancing response reinforcement but rather is en- 
hancing or interacting with the central motive state 
induced by the electrical stimulation. Furthermore, 
when the current intensity is below threshold, rats 
will not respond unless water is available, so that oral 
stimulation accompanies self-stimulation (Mendelson, 
1967; Mogenson, Morgan, Phillips, & Stevenson, 1968). 

External stimuli are frequently most effective as 
motivational stimuli when they occur in the presence 
of a central motive state—for example, following a 
period of water or food deprivation. Stimulation of 
central reinforcement sites induces a motive state—it 
elicits feeding (Hoebel & Teitelbaum, 1962) and drink- 
ing (Mogenson & Stevenson, 1966)—so that the animal 
is more responsive to appropriate external stimuli 
(such as the sight of water or smell of food) and its 
behavior is appropriate to the situation. The brain 
stimulation also activates neurons concerned with 
the transmission and processing of external, motiva- 
tional stimuli (Pfaffman, 1960). The rate of self-stimu- 
lation, and apparently the strength of central rein- 
forcement, is increased either by the presence of 
appropriate external motivational stimuli (see previ- 
ous paragraph) or by increasing the central motive 
state (by food deprivation—for example, J. Olds, 
1958a). It follows, therefore, that the event designated 
reinforcement is the occurrence of external motiva- 
tional stimuli in the presence of a central motive 
state.4 This combination occurs when an animal en- 
counters food or water in the presence of an appropriate 
drive state, and a similar combination is produced 
apparently by stimulation of certain brain sites which 
yield central reinforcement. 

Fig. 8. A rat self-stimulates and is induced by the hypothalamic 
stimulation to drink water. At the lower current intensity (15 
µA) the rate of self-stimulation is faster when the animal is in- 
duced to drink. Self-stimulation is facilitated by the oral stimu- 
lation. (From Mogenson & Morgan, 1967.) 

Rolls (1972) has obtained unit activity data which 
can be interpreted to support the view that the rein- 
forcement of self-stimulation is primarily associated 
with incentive, motivational stimuli. Rolls reported 
that with self-stimulation from both the lateral hypo- 
thalamus and septum, neurons in the amygdala were 
activated. If stimulus-bound feeding and drinking 
could be elicited from the hypothalamic placements, 
not only was there neural activation of the amygdala 
but in addition cells in the midbrain were activated. 
This midbrain activity did not occur from sites which 
maintained self-stimulation only. Rolls has suggested 
that these midbrain neurons are associated with in- 
creased “arousal”—or, in our terms, they contribute to 
the central motive state. One could speculate further 
that activation of the neurons of the amygdala may 
be involved primarily with incentive stimuli, given 
the amygdala’s involvement with sensory input, 
whereas the midbrain structures are more involved 
with the induction of a central motive state. It could 
be argued, therefore, that since self-stimulation oc- 
curred only when amygdaloid neurons were activated, 
central reinforcement depends primarily on activating 
neural elements concerned with incentive motivation. 


4 These views are similar to those of Bindra (1968), who has 
made a comprehensive historical analysis of theories of conven- 
tional reinforcers and the changing emphasis in the treatment 
of motivation. 

It is difficult, however, on the basis of these data to 
exclude the possibility that such reinforcing brain 
stimulation might also influence the central motive 
state so that the magnitude of the central reinforce- 
ment increases as the electrical stimulation has a 
greater effect on the central motive state. The data 
reviewed earlier which demonstrate increases in self- 
stimulation rates with appropriate changes in the 
central motive state would of course support the in- 
volvement of the central motive state in central rein- 
forcement. 


A Model for Central and Conventional 
Reinforcement 


If central and conventional reinforcement are sim- 
ilar, as has been suggested in earlier sections, it should 
be possible to deal with both in relation to the same 
theoretical model. This will be attempted in this sec- 
tion. 

The model shown in Figure 9, which we have 
selected for purposes of illustration, is a slight modifi- 
cation of the one proposed by Milner (1970). It as- 
sumes, as indicated above, that reinforcement results 
from an interaction of appropriate internal and ex- 
ternal stimuli. External stimuli that are involved in 
incentive motivation have discriminative properties, 
but they also, along with internal, deficit stimuli, in- 
duce a central motive state. We are defining reinforce- 
ment, as indicated above, as the occurrence of incen- 
tive stimuli in the presence of a central motive state. 
We shall first describe how this model applies to a 
conventional reinforcer (i.e., water) and then go on to 
consider central reinforcement. 

Fig. 9. A model of central and conventional reinforcement. 
Reinforcement is assumed to result from the interaction of a 
central motive state with appropriate incentive stimuli (see 
text). A central motive state is induced by internal deficit 
(pathway 2) stimuli and by external incentive (conditioned and 
unconditioned, pathway 1) stimuli. Many incentive stimuli have 
at least a weak arousing effect (pathway 3) and also influence 
the central motive state. Central reinforcement according to the 
model is due to the activation of pathway 1 or pathways 1 and 
2 concurrently. When the response-hold mechanism is activated 
by appropriate inputs along the incentive pathway the response 
switch mechanism is inhibited, the response program generator 
sends a fixed command to the motor facilitator, and a stable 
behavior (R1, such as lever pressing for brain stimulation) 
occurs. If this behavior is accompanied by appropriate stimuli 
(e.g., water or saccharine solution in rats that self-stimulate and 
drink concurrently—Mogenson & Morgan, 1967), the central 
reinforcement is facilitated (pathway 5). On the other hand, 
drive-reducing stimuli (gastric distention, elevated blood sugar) 
reduce reinforcement (pathway 6). (Adapted from Milner, 1970.) 

When an animal is deprived, water-deficit signals 
initiate behavioral activation and increased explora- 
tion. According to the model, exploratory activity re- 
sults from these drive stimuli, increasing arousal 
which influences the motor facilitator, and from in- 
creased input to the response switch mechanism, which 
accounts for the variability of behavioral response. If 
the animal encounters water, which is an appropriate 
incentive stimulus, and drinks, the response-hold 
mechanism is activated and it in turn inhibits the 
response switch mechanism. The animal will then 
continue to drink, since this behavior keeps it in the 
presence of the incentive stimulus, until the drive 
stimuli are reduced by water intake (or inhibited by 
short-term satiety signals) and the influence of the 
arousal mechanism on the motor facilitator and the 
response-hold mechanism disappears. 

How is central reinforcement explained by this 
model? One suggestion is that central reinforcement 
results from the simultaneous activation of pathways 
that normally transmit incentive stimuli and drive 
stimuli (pathways designated 1 and 2 in Figure 9), a 
proposal that is similar to Deutsch’s hypothesis dis- 
cussed earlier. Initially the animal presses the lever by 
chance and turns on the brain stimulation; the re- 
sponse-hold mechanism is activated and it quickly 
acquires the lever-press response. The response switch 
is then inhibited and the self-stimulation behavior 
may continue for a very long time. If extinction oc- 
curs rapidly, as is often the case when the brain stimu- 
lation ceases, it is because there is no longer input to 
the motor facilitator and the response-hold mech- 
anism. 

The strongest central reinforcement is obtained 
with electrical stimulation of sites of the hypothala- 
mus from which drinking, feeding, and other mo- 
tivated behaviors are elicited. As indicated earlier, the 
rate of self-stimulation is increased during elicited 
drinking (Mogenson & Morgan, 1967) and is increased 
further by adding saccharine to the water (Phillips & 
Mogenson, 1968). The rate of self-stimulation is also 
frequently increased by food deprivation (J. Olds, 
1958a). Assuming that pathways 1 and 2 are being 
activated during concurrent self-stimulation and drinking 
or feeding, we suggest that the oral stimulation 
and the deprivation enhance central reinforcement by 
increasing the inputs along pathways 1 and 2, respectively. 

Another possibility is that the central reinforce- 
ment is due to activation of only those pathways that 
transmit incentive stimuli (pathway 1). Certain con- 
ventional incentive stimuli seem to be effective rein- 
forcers even in the absence of internal, deficit (drive) 
stimuli (Pfaffman, 1960). Perhaps this is because they 
have at least a weak arousal effect (see postulated path- 
way 3), thereby influencing the motor facilitator as 
well as the response-hold mechanism. 


Central Catecholamine Pathways 
and Self-Stimulation 


The region of the medial forebrain bundle as it 
passes through the lateral hypothalamus (LH) has 
been recognized for some time as the “hotspot” for 
self-stimulation (J. Olds & M. E. Olds, 1964). For a 
time it appeared that this might be because of the 
association of this region with drive systems such as 
feeding, drinking, etc. (Hoebel, 1969; Mogenson, 
1969a). More recent evidence, however, suggests that 
the potent central reinforcement from stimulation of 
this region results from activating ascending nor- 
adrenergic and dopaminergic neurons which are 
densely concentrated here (German & Bowden, 1974). 

The noradrenergic and dopaminergic pathways are 
shown in Figure 10. Using histofluorescence tech- 
niques a group of Swedish workers demonstrated that 
these neural pathways project from the midbrain and 
lower in the brainstem through the region of the 
lateral hypothalamus to the basal ganglia, limbic 
forebrain structures, and cerebral cortex (Fuxe, Hökfelt, 
& Ungerstedt, 1970; Ungerstedt, 1971). Self-stimu- 
lation results from stimulation of ascending nor- 
adrenergic neurons in the medial forebrain bundle 
(Stein, Wise, & Berger, 1972) as well as from stimula- 
tion of the locus coeruleus, the site from which fibers 
of the dorsal noradrenergic bundle originate (Crow, 
1972b; Ritter & Stein, 1973) and from stimulation of 
the ventral noradrenergic neurons (Ritter & Stein, 
1974). Self-stimulation has also been obtained from a 
number of sites known to contain dopaminergic 
neurons, such as the substantia nigra (Crow, 1972a; 
Phillips & Fibiger, 1973; Routtenberg & Malsbury, 
1969) and the area adjacent to the interpeduncular 
nucleus (Dreese, 1966). Furthermore, it has been re- 
ported that dopamine is released from dopaminergic 
terminals during self-stimulation (Arbuthnott, Crow, 
Fuxe, Olson, & Ungerstedt, 1970). 

Fig. 10. Catecholaminergic pathways demonstrated with the 
histofluorescence technique which project from the midbrain 
and brainstem through the hypothalamus to limbic forebrain 
structures (tub. olfactorium, bulbus olf., septum, n. accumbens, 
n. amygdaloid centralis, hippocampus), basal ganglia (n. cauda- 
tus), and cerebral cortex. The locus coeruleus is at the site of 
the A6 cell bodies as shown in B. The substantia nigra and the 
interpeduncular nucleus are in the region of the A9 and A10 
cell bodies shown in A. (From Ungerstedt, 1971.) 


Neuropharmacological manipulations of these 
catecholaminergic pathways also 
influence self-stimulation, providing additional evi- 
dence for their role in central reinforcement. Drugs 
such as alpha-methyl-p-tyrosine, which inhibits cate- 
cholamine synthesis, and reserpine, which depletes 
catecholamine stores, cause a decrement in self-stimu- 
lation (Cooper, Black, & Paolino, 1971; J. Olds, 1956; 
Poschel & Ninteman, 1966; Stein, 1962). On the other 
hand, drugs such as monoamine oxidase inhibitors, 
which increase catecholamine levels, and amphetamine 
or cocaine, which increase the synaptic release and 
block the reuptake of catecholamines, facilitate self- 
stimulation (Crow, 1970; Domino & Olds, 1972; Horo- 
witz, Chow, & Carlton, 1962; M. E. Olds, 1970; Phillips 
& Fibiger, 1973; Poschel, 1969; Stein, 1962, 1964). 
Stein and Wise (1969) maintained that noradrenergic 
neurons have an exclusive role in self-stimulation. 
However, more recent neuropharmacological evidence 
has implicated dopaminergic neurons (Lippa, Antelman, 
Fisher, & Canfield, 1973). Convincing evidence 
for the role of dopaminergic neurons has been pro- 
vided in experiments utilizing dopamine blockers and 
agonists. Pimozide and haloperidol, relatively selec- 
tive blockers of dopaminergic receptors when given in 
low doses, cause a marked reduction in self-stimulation 
(Liebman & Butcher, 1973; Wauquier & Niemegeer, 
1972). Rats will lever-press to self-administer, via a 
jugular catheter, a central dopamine-receptor stimu- 
lant, apomorphine. The self-infusion of apomorphine 
apparently reinforces lever pressing because it activates 
dopaminergic synapses of the “central reinforcement 
system” (Baxter, Glickman, Stein, & Scerni, 1974). 

Recently there has been a good deal of interest in 
the physiological and behavioral effects of lesioning 
the noradrenergic and dopaminergic pathways. Since 
such studies suggest the functional relevance of these 
pathways and might provide some clues about their 
role in self-stimulation, they will now be reviewed 
briefly; for a more complete review of this literature 
see Mogenson and Phillips (1975) and Stricker and 
Zigmond (1975). 

When the dorsal noradrenergic and the nigrostriatal 
dopaminergic pathways are damaged or destroyed 
with electrolytic lesions or more selectively with a 
neurotoxin, 6-hydroxydopamine, the tonic alertness 
and phasic behavioral arousal of the animal are dis- 
rupted (Chu & Bloom, 1973; Jones, Bobillier, Pin, & 
Jouvet, 1973). The animal is drowsy and somnolent, 
suffers from sensory neglect and disturbance of sen- 
sorimotor integration, and has difficulty in initiating 
behavior (Fibiger, Zis, & McGeer, 1973; Marshall & 
Teitelbaum, 1973; Stricker & Zigmond, 1975; Unger- 
stedt, 1971). Included in the behavioral deficits is a 
disturbance in feeding and drinking. The dorsal nor- 
adrenergic pathway which projects diffusely to the 
cerebral cortex inhibits cortical inhibitory neurons, 
thereby causing cortical activation and behavioral 
arousal (E. Roberts, 1974). The nigrostriatal dopami- 
nergic pathway projects to extrapyramidal motor 
structures. When these pathways are damaged there 
are deficits in behavioral arousal, extrapyramidal 
motor functions, and affect (Stricker & Zigmond, 1975). 

If these catecholamine pathways have an essential 
role in central reinforcement, as suggested earlier, 
damage or destruction of the pathways should cause 
a severe decrement in self-stimulation. This has been 
demonstrated using 6-hydroxydopamine, the neuro- 
toxin which selectively destroys noradrenergic and 
dopaminergic neurons. When central catecholamine 
neurons are destroyed by administering 6-hydroxy- 
dopamine into the ventricles, self-stimulation of the 
lateral hypothalamus is markedly reduced (Breese, 
Howard, & Leahy, 1971; Phillips, unpublished ob- 
servations; Stein & Wise, 1971). 

The anatomical, neuropharmacological, and lesion- 
ing studies discussed in the previous paragraphs show 
that central reinforcement is associated with stimula- 
tion of brain catecholamine (CA) pathways. We now 
consider the implications of this relationship for 
understanding the mechanisms of brain self-stimula- 
tion. 

As indicated above, the most popular explanations 
of brain self-stimulation were from the viewpoint of 
drive and incentive theories of motivation. It is not 
surprising, therefore, that the first attempt to deal 
with the role of CA pathways in central reinforce- 
ment was in terms of drive and incentive systems. 
Crow (1973) suggested that stimulation of the dopa- 
minergic (DA) nigrostriatal pathway is reinforcing 
because it is involved in the processing of incentive 
stimuli. Olfactory projections go to the habenular 
nucleus and then to the interpeduncular nucleus 
where, according to Crow, they influence the dopa- 
minergic neurons (A9 cells shown in Fig. 10). Self- 
stimulation of the DA pathway results from the 
activation of fibers that transmit olfactory incentive 
stimuli. Crow hypothesized also that the dorsal nor- 
adrenergic (NA) bundle subserves central reinforce- 
ment based on drive reduction. Anatomical evidence 
is again cited in support of the proposal, particularly 
the close relationship between dorsal NA neurons in 
the locus coeruleus (A6 cells shown in Fig. 10) and the 
nucleus of the tractus solitarius which receives gusta- 
tory input. Since gustatory stimuli are closely associ- 
ated with the termination of gustatory behavior, Crow 
suggests that fibers of the dorsal NA pathway are 
activated by stimuli associated with drive reduction. 

Crow’s proposal to account for the role of CA path- 
ways in self-stimulation emphasized sensory stimuli. 
Although olfactory and gustatory signals and path- 
ways are stressed, the model might also be extended 
to deal with visual, auditory, proprioceptive, intero- 
ceptive, and other biologically significant stimuli 
known to be involved in reinforcement (Mogenson & 
Phillips, 1975). Crow’s formulation will appeal to 
those who stress the sensory side of the nervous system 
and for whom the concepts of drive reduction and 
incentive have special significance. However, it does 
not deal with the relationship of the CA pathways to 
the neural systems for the initiation and motor con- 
trol of behavior, a serious shortcoming for those who 
favor the views of Glickman and Schiff (1967) or Mil- 
ner (1970) rather than those of Deutsch and Howarth 
(1963) or Pfaffman (1960). In the final section we con- 
sider the role of the CA pathways in brain self-stimu- 
lation from the viewpoint of their anatomical and 
physiological relationships with neural systems con- 
cerned with the motor control of behavior. 


Central Reinforcement as the Inhibition by 
Catecholamine Pathways of the Neural 
Systems for the Motor Control of Behavior 


The DA nigrostriatal pathway and the dorsal NA 
pathway project to the striatum, the cerebral cortex, 
and the cerebellum, structures which make important 
contributions to the motor control of behavior (Figure 11). 
Damage to these pathways, as indicated in 
the previous section, disrupts feeding, drinking, and 
other goal-directed behaviors; one of the most prominent 
deficits is in the initiation of responding. Although 
there has been considerable interest in the 
role of these pathways in feeding and drinking, be- 
cause of their strategic location at the interface be- 
tween neural systems concerned with the “intention 
to respond” and those concerned with “motor con- 
trol” it is likely that they contribute to a variety of 
behaviors. 

Fig. 11. The DA nigrostriatal pathway and the dorsal NA pathway project to key struc- 
tures involved in the initiation and motor control of behavior—the striatum, the cerebral 
cortex, and the cerebellum. 

(A) Parallel systems project from the neocortex to the lower motor neurons of the spinal 
cord via the striatum and cerebellum. The striatum, which is in a position to sample activ- 
ity in the motor cortex and cerebellum (via projections through the intralaminar nucleus 
of the thalamus), the limbic system, and the sensory and association cortex, has an 
important role in the translation of the “intention to respond” into the “command sig- 
nals” for motor control. The cerebellum contributes to error detection through pro- 
jections to the motor cortex (via the ventrolateral thalamus) and provides subroutines 
for the execution of the intended movement. (Based on Kemp & Powell, 1971.) 

(B) The DA nigrostriatal pathway and the dorsal NA pathway exert inhibitory effects 
on the striatum, cerebral cortex, and cerebellum. According to Milner (1975), these 
inhibitory effects are the key to the central reinforcement from direct stimulation of 
these pathways, since these effects suppress response-inhibitory systems permitting the 
(reinforced) ongoing behavior to be protected and to continue. (From Mogenson & 
Phillips, 1975.) 

Motor control of behavior depends on complex 
interrelationships among a number of brain structures 
(Figure 11). It is not our purpose to consider the 
dynamic functioning of these structures and pathways 
(for further details see Kemp & Powell, 1971; Mogen- 
son & Phillips, 1975). Instead we are concerned with 
the possibility that the anatomical and functional 
relationships of the DA and NA pathways with neural 
systems for motor control may provide some clues 
about their role in central and conventional rein- 
forcement. Inhibitory effects are exerted by the DA 
nigrostriatal pathway and the dorsal NA pathway on 
the striatum, cerebral cortex, and cerebellum (Conner, 
1970; Curtis & Crawford, 1969; Segal & Bloom, 1974). 
This may be surprising, since these pathways make 
important contributions to cortical activation and be- 
havioral arousal (Jones et al., 1973; Jouvet, 1972). 
However, the electrophysiological and behavioral data 
are easily accounted for by the proposal that DA and 
NA axon terminals inhibit inhibitory interneurons, 
thereby causing the disinhibition of neurons con- 
cerned with cortical activation and motor control of 
behavior (E. Roberts, 1974). According to Roberts, the 
central nervous system consists of “genetically pre- 
programmed circuits which are released for action by 
neurons (command neurons) that are strategically lo- 
cated at junctions in neuronal hierarchies dealing 
with both sensory input and effector output” (p. 127). 
It has been suggested by several investigators (e.g., 
Glickman & Schiff, 1967) that such preprogrammed 
circuits for biting, chewing, swallowing, and other 
components of goal-directed behaviors are represented 
in the brainstem. Roberts suggests that “segmental 
command neurons, like the circuits they control, are 
largely inhibited from above, and that a decrease in 
inhibition allows command neurons to fire, thereby 
releasing the preprogrammed circuits over whose ac- 
tivity they preside” (p. 128).5 

Milner (1975) has suggested that the inhibitory 
effects of the DA and NA neurons, and specifically the 
disinhibition they exert on neural systems that initiate 
and control behavior, are the key to understanding 
self-stimulation and central reinforcement. His pro- 
posal will now be presented. 

It is well known that ongoing behavioral responses 
may be interrupted by a novel stimulus. Milner at- 
tributes this to the activation of a response-inhibitory 
system in the cerebral cortex.6 The interruption of 
ongoing behavior by sensory stimuli does not occur, 
however, if the behavior is reinforced either by a con- 
ventional reinforcer or by central reinforcement. Ac- 
cording to Milner, activation of the inhibitory cate- 
cholamine pathways, either by direct stimulation in 
the self-stimulation procedure or due to the central 
effects of conventional reinforcers, suppresses the 
response-inhibitory system so that the ongoing be- 
havior is protected and maintained. 

5 Roberts (1974), after reviewing the relevant evidence, hy- 
pothesizes that fibers of the dorsal NA pathway inhibit 
γ-aminobutyric acid (GABA) interneurons in the upper layers of 
the cerebral cortex, thereby influencing electroencephalographic 
cortical arousal and behavior. He also suggests that inhibitory 
GABA neurons in the striatum, which tonically inhibit pre- 
programmed neural circuits for patterned postural control and 
movements, could be inhibited by fibers of the DA nigrostriatal 
pathway. 

In an earlier section we emphasized a number of 
similarities between reinforcement obtained with 
conventional reinforcers and that from direct stimu- 
lation of the brain. Milner’s ingenious hypothesis sug- 
gests a mechanism common to both conventional rein- 
forcement and central reinforcement. Furthermore, it 
provides a role for drive and incentive stimuli as ac- 
tivators of the catecholamine ‘‘reward’”’ neurons and 
at the same time links these neurons to neural sys- 
tems for motor control. The hypothesis appears to 
provide a fruitful integration and synthesis of the 
previous hypotheses of central reinforcement that 
stressed sensory input (Deutsch & Howarth, 1963; 
Pfaffman, 1960) and response elicitation (Glickman & 
Schiff, 1967). 


6 The increased activity and greater persistence of exploratory 
behavior following lesions of certain regions of the cerebral 
cortex and hippocampus indicate that these structures are part 
of a response-inhibitory system. Milner (1976) suggests that 
“arousal of any cortical activity produces a transient response 
inhibition” so that “the immediate effect of the presentation 
of a new stimulus is to interrupt the on-going response.” Ac- 
tivation of the catecholamine pathways protects these responses 
from being suppressed, however, by inhibiting the response- 
inhibitory system. “Thus, food, for a hungry animal, not only 
elicits approach, chewing, swallowing, and so on, but sends in- 
hibitory input to [the response-inhibitory system] to ensure that 
cortical outflow does not interfere with the performance.” Simi- 
larly, direct electrical stimulation of the catecholamine pathways 
could prevent cortical outflow from disrupting the responses 
that preceded the central stimulation so that the self-stimulation 
behavior continues. 

Milner’s views could be easily integrated with those of 
Roberts (1974). Although Milner limits his analysis to disinhibi- 
tion of cortical functions, it could be readily extended to include 
disinhibition of the striatum by the inhibitory effects of DA 
nigrostriatal neurons on striatal inhibitory interneurons. In other 
words, the response-inhibitory system postulated by Milner could 
be represented in the striatum as well as the cerebral cortex. 


SUMMARY 


The discovery of brain self-stimulation in the early 
1950s aroused a good deal of enthusiasm among be- 
havioral and neural scientists interested in the modifi- 
cation of behavior by experience and the underlying 
modifiability of the nervous system. The potent rein- 
forcement of various operant responses by electrical 
stimulation of the hypothalamus, limbic structures, 
and other regions of the brain seemed to provide a 
means of studying directly the basic mechanisms of 
reinforcement and learning. A number of the subse- 
quent studies appeared to demonstrate, however, some 
important differences between central and conven- 
tional reinforcers—in the rapidity of response acquisi- 
tion, in the persistence and relative insatiability of 
responding, in extinction, when used in partial rein- 
forcement programs, and when presented with secon- 
dary reinforcers. Some of these differences appeared 
to be anomalies when considered in relation to drive- 
reduction theory, the most widely accepted and influ- 
ential theory of motivation and reinforcement at that 
time. Also, some of the experimental findings, such as 
the elicitation of drinking and feeding from self- 
stimulation sites, seemed paradoxical when considered 
from this theoretical point of view; for example, why 
should an animal perform a response which makes it 
thirsty when, according to the theory, reinforcement 
results from drive reduction? 

In recent years, there has been less emphasis on 
drive theory and increasing interest in incentive moti- 
vation. Evidence from self-stimulation experiments 
has contributed to this trend, which has led to a com- 
pletely different way of thinking about self-stimula- 
tion and central reinforcement, as well as about the 
broader field of motivation and conventional rein- 
forcement. At the same time, it was demonstrated that 
the differences between central and conventional rein- 
forcers were not really of a fundamental nature but 
were due primarily to procedural differences, particu- 
larly in the delay of reinforcement and the depriva- 
tion state of the animal. This recent concern with the 
similarities between central and conventional rein- 
forcement, in contrast to the emphasis on differences 
in the past, has revitalized the view that self-stimula- 
tion can provide important insights about reinforce- 
ment and motivation. 

Rate of self-stimulation, a widely used although im- 
perfect index of central reinforcement, is increased 
by hunger, the administration of sex hormones, and 
other manipulations that induce or increase a central 
motive state. Self-stimulation is also increased by 
taste, smell, and other relevant incentive stimuli. 
These observations suggest that reinforcement in- 
volves an interaction of a central motive state with 
incentive stimuli. 

Recently it was shown that central reinforcement 
is associated with stimulation of catecholamine path- 
ways of the brain. It has been proposed that the 
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nigrostriatal dopamine pathway transmits incentive 
stimuli, that the dorsal noradrenalin pathway trans- 
mits drive stimuli, and that these form the basis of 
central reinforcement. An alternative hypothesis has 
been derived from the observation that these cate- 
cholamine pathways exert inhibitory effects on the 
cerebral cortex, striatum, and other neural structures 
concerned with the initiation and motor control of 
behavior. The hypothesis was considered that stimula- 
tion of catecholamine pathways is reinforcing because 
they exert suppressive effects on response-inhibitory 
systems of the brain. This hypothesis not only ac-
counts for conventional reinforcement but also inte-
grates and synthesizes previous hypotheses of central
reinforcement that emphasized sensory input and 
response elicitation. 


APPENDIX A 


Some examples of studies of self-stimulation from
various sites and species


PYRIFORM CORTEX: Rat—Olds & Olds, 1963: Cat—O’Donohue
& Hagamen, 1967.


CINGULATE GYRUS: Rat—Olds & Olds, 1963: Cat—O’Donohue
& Hagamen, 1967.


HIPPOCAMPUS: Rat—Olds & Olds, 1963; Ursin, Ursin, & Olds,
1966: Cat—O’Donohue & Hagamen, 1967.


FORNIX: Rat—Olds & Olds, 1963: Cat—O’Donohue & Haga-
men, 1967: Gerbil—Routtenberg & Kramis, 1967: Rabbit—
Brunner, 1966.


SEPTUM: Rat—Olds & Olds, 1963: Cat—O’Donohue & Haga-
men, 1967; Wilkinson & Peele, 1963: Monkey—Bursten
& Delgado, 1958; Lilly, 1957: Man—Heath, 1963; Bishop,
Elder, & Heath, 1963: Rabbit—Campbell, 1968.


OLFACTORY TUBERCLE: Rat—Olds & Olds, 1963: Cat—O’Dono-
hue & Hagamen, 1967.


OLFACTORY BULB: Rat—Routtenberg, 1971; Phillips, 1970.


PREOPTIC: Rat—Olds & Olds, 1963; Olds, Travis, & Schwing,
1960: Cat—O’Donohue & Hagamen, 1967; Wilkinson &
Peele, 1963: Monkey—McHugh, Black, & Mason, 1960.


LATERAL HYPOTHALAMUS: Rat—Olds & Olds, 1963: Cat—
O’Donohue & Hagamen, 1967; Wilkinson & Peele, 1963:
Monkey—Briese & Olds, 1964: Rabbit—Brunner, 1966:
Goat—Persson, 1962: Squirrel—Wetzel & King, 1966; Wet-
zel, King, & Nowicki, 1967: Dog—Stark & Boyd, 1963;
Bacon & Wong, 1972: Pigeon—David, Davison, & Web-
ster, 1972; MacPhail, 1966; Webster & Beale, 1970.


VENTROMEDIAL HYPOTHALAMUS: Rat—Olds & Olds, 1963:
Cat—O’Donohue & Hagamen, 1967.


MEDIAL FOREBRAIN BUNDLE: Rat—Olds & Olds, 1963; Olds,
Travis, & Schwing, 1960: Cat—O’Donohue & Hagamen,
1967; Schnitzer, Reid, & Porter, 1965; Wilkinson & Peele,
1963: Gerbil—Routtenberg & Kramis, 1967.


ZONA INCERTA: Rat—Olds & Olds, 1963: Monkey—Briese &
Olds, 1964: Squirrel—Wetzel & King, 1966; Wetzel, King,
& Nowicki, 1967.


THALAMUS: Rat—Olds & Olds, 1963; Cooper & Taylor, 1967:
Cat—O’Donohue & Hagamen, 1967; Grastyan & Angyan,
1967: Man—Heath, 1963; Bishop, Elder, & Heath, 1963:
Squirrel—Wetzel, King, & Nowicki, 1967: Dog—Bacon &
Wong, 1972.


CAUDATE NUCLEUS: Rat—Routtenberg, 1971; Olds, Travis, &
Schwing, 1960: Cat—O’Donohue & Hagamen, 1967; Juste-
sen, Sharp, & Porter, 1963: Monkey—Briese & Olds, 1964;
Lilly, 1957: Man—Heath, 1963; Bishop, Elder, & Heath,
1963: Dog—Bacon & Wong, 1972: Dolphin—Lilly & Miller,
1962.


AMYGDALA: Rat—Olds & Olds, 1963: Cat—O’Donohue &
Hagamen, 1967: Monkey—Briese & Olds, 1964: Man—
Heath, 1963; Bishop, Elder, & Heath, 1963.


GLOBUS PALLIDUS: Rat—Olds & Olds, 1963: Cat—O’Donohue
& Hagamen, 1967: Monkey—Routtenberg, Gardner, &
Huang, 1971; Lilly, 1957.


PUTAMEN: Monkey—Lilly, 1957.


CENTRAL GREY: Rat—Cooper & Taylor, 1967.


VENTRAL TEGMENTUM: Rat—Olds & Olds, 1963; Olds &
Peretz, 1960; Routtenberg & Malsbury, 1969: Cat—
O’Donohue & Hagamen, 1967; Wilkinson & Peele, 1963.


SUBSTANTIA NIGRA: Rat—Crow, 1972a; Routtenberg & Mals-
bury, 1969: Cat—O’Donohue & Hagamen, 1967: Monkey—
Briese & Olds, 1964: Gerbil—Routtenberg & Kramis, 1967.


RED NUCLEUS: Cat—O’Donohue & Hagamen, 1967.


LOCUS COERULEUS: Rat—Crow, 1972b.


FRONTAL CORTEX: Rat—Routtenberg, 1971; Routtenberg &
Sloan, 1972: Cat—O’Donohue & Hagamen, 1967; Wilkin-
son & Peele, 1963.


BRACHIUM CONJUNCTIVUM: Rat—Routtenberg & Malsbury,
1969: Cat—O’Donohue & Hagamen, 1967: Monkey—Briese
& Olds, 1964.


CEREBRAL PEDUNCLE: Rat—Olds & Olds, 1963: Squirrel—
Wetzel, King, & Nowicki, 1967.


PALEOSTRIATUM: Pigeon—Goodman & Brown, 1966; Mac-
Phail, 1966, 1967.


NEOSTRIATUM: Pigeon—MacPhail, 1966, 1967; Webster &
Beale, 1970.


HYPERSTRIATUM: Pigeon—MacPhail, 1966, 1967.


ECTOSTRIATUM: Pigeon—MacPhail, 1966; Hollard & Davi-
son, 1971.


PALLIDUM: Goldfish—Boyd & Gardner, 1962.
The Experimental Production of Altered Physiological States:
Concurrent and Contingent Behavioral Models


INTRODUCTION 


The development of laboratory behavioral ap- 
proaches to the production of altered physiological 
states reflects the emergence of two general models for 
their experimental analysis. The more traditional con- 
current model emphasizes the effects of prior-occurring 
or accompanying environmental-behavioral interac- 
tions on physiological processes. The early work of 
Pavlov and Cannon relating autonomic changes to 
environmental antecedents provides classical examples 
of such laboratory studies. Current applications of 
this model have extended the analysis of both respon- 
dent and operant conditioning effects upon a broad 
range of physiological processes (e.g., hormonal secre- 
tions). The more recent contingent model, in con- 
trast, focuses on environment-behavioral interactions 
which follow physiological change and provide con- 
trolling consequences for instrumental conditioning 
effects involving visceral-autonomic responses. Appli- 
cations of this model have shown, for example, that 
both increases and decreases in heart rate can be pro-
duced by experimentally-programmed environmental
consequences (e.g., food delivery and/or shock avoid- 
ance) which follow such autonomic changes. Over the 
decade since the topic of behaviorally-induced physio- 
logical alterations was reviewed for the original vol- 
ume on Operant Behavior (Brady, 1966), a substantial 
literature has emerged on instrumental visceral- 
autonomic conditioning. Both the laboratory and 
clinical descriptions of such phenomena provide con- 
vincing evidence of the extensive behavioral-environ- 
mental influences available for experimental analysis 
within the framework of this contingent psychophysi- 
ological model. 


CONCURRENT MODELS 


Early Studies 


Research in this area has traditionally emphasized 
Pavlovian (i.e., respondent) conditioning processes
concerned primarily with adjustments of the orga- 
nism’s “internal economy” (e.g., Cannon, 1915; Gantt, 
1960; Pavlov, 1879), and an active experimental inter-
est in such classical psychophysiology continues to be
reflected in contemporary research literature (Black,
1971; Figar, 1965; Harris & Brady, 1974). By the mid- 
dle of the present century, however, reliable methods 
were developing for direct observation and measure- 
ment of a wide range of physiological processes in 
situations involving both classical respondents and in- 
strumental or operant performances. In the middle 
1960s, a review of this area (Brady, 1966) surveyed 
three major groups of laboratory operant conditioning 
studies related to the experimental production of 
altered physiological states. 


Transient Physiological Changes 


The first group of reports described relatively tran- 
sient cardiorespiratory and neurophysiological changes 
which were for the most part confined to experimen- 
tal periods during which the animal subjects were 
engaged in some required operant performance (e.g.,
Sidman shock-avoidance), and appeared to represent 
physiological responses related to stimuli produced by 
the instrumental behavior. Cardiorespiratory effects 
were found to accompany both appetitively and aver- 
sively maintained instrumental performances, but 
they seldom if ever endured beyond the limits of the 
specific experimental condition which provided the 
occasion for their appearance (Berlanger & Feldman, 
1962; Eldridge, 1954; Hahn, Stern, & McDonald, 1962; 
Malmo, 1961; Perez-Cruet, Black, & Brady, 1963; Perez- 
Cruet, Tolliver, Dunn, Marvin, & Brady, 1963; Sha-
piro & Horn, 1955; Wenzel, 1961). Central nervous
system effects described in electrophysiological studies 
involving instrumental conditioning procedures were 
found to be characterized by a similar transience (An-
liker, 1959; Hearst, Beer, Sheatz, & Galambos, 1960;
John & Killam, 1959; Porter, Conrad, & Brady, 1959;
Ross, Hodes, & Brady, 1962), though the findings were
suggestive of relationships involving behaviorally-
induced autonomic and somatic changes.


Durable Physiological Alterations 


The second group of studies called attention to 
more durable endocrinological and visceral-alimentary 
changes. The production of marked obesity in normal
rats by controlling drinking behavior as a shock avoid- 
ance response (Williams & Teitelbaum, 1956), and the 
experimental elevation of alcohol ingestion levels in 
rhesus monkeys during and for prolonged periods fol- 
lowing exposures to shock-avoidance conditioning 
(Clark & Polish, 1960) were represented as two of the 
more dramatic examples of relatively durable be-
haviorally-induced alimentary changes. Endocrinolog-
ical effects associated with such instrumental condition-
ing procedures also appeared somewhat less transient
than similarly induced cardiorespiratory alterations. 
The systematic analysis of such psychoendocrinologi- 
cal relations was extensively described in a series 
of studies providing at least a partial basis for the 
psychophysiological approach which has continued to 
characterize many aspects of the more recent develop- 
ments in this research field (Mason & Brady, 1956; 
Mason, Brady, Polish, Bauer, Robinson, Rose, & Tay-
lor, 1961; Mason, Brady, Robinson, Taylor, Tolson,
& Mougey, 1961; Mason, Brady, & Sidman, 1957; 
Mason, Mangan, Brady, Conrad, & Rioch, 1961;
Mason, Nauta, Brady, Robinson, & Sachar, 1961; Sid- 
man, Mason, Brady, & Thach, 1962). 


Chronic Somatic Effects 


The third and final group of studies surveyed in 
the 1966 review dealt with the role of operant be- 
havior in the production of chronic somatic changes 
of the type associated with gastric ulcers and systemic 
infections. The relatively few studies which had ap-
peared in this area prior to the first Operant Be-
havior volume testified to the many difficulties pre-
sented by an experimental analysis of the behavioral
factors involved. The observed somatic changes were 
generally irreversible. This imposed severe restrictions
upon replications within individual animals, a pro-
cedure which has typically characterized operant re-
search. In 1956, for example, Sawrey and Weisz first
reported the production of gastric ulcers in rats ex-
posed to an instrumental “conflict” procedure, though
subsequent studies (Conger, Sawrey, & Turrell, 1958;
Sawrey, Conger, & Turrell, 1956; Weisz, 1957) clearly
suggested that multiple interacting factors, including
food deprivation, “fear,” shock, weight loss, and even
social experience could not be readily teased apart in
such “conflict”-produced alterations. A subsequent
series of studies at the Walter Reed Medical Center 
in Washington was concerned with the production of 
peptic ulcers in rhesus monkeys who were exposed to 
recurrent instrumental avoidance performance re- 
quirements (Brady, 1958; Brady, 1963; Brady & Polish, 
1960; Brady, Porter, Conrad, & Mason, 1958; Polish, 
Brady, Mason, Thach, & Neimeck, 1962; Porter, 
Brady, Conrad, Mason, Galambos, & Rioch, 1958) and 
these experiments confirmed the complexity of such 
effects. Two additional studies, however, one with
humans (Davis & Berry, 1963) and one with animals 
(Rice, 1963), were able to provide further support for 
the indicated relationship between instrumental 
avoidance performances and gastro-intestinal changes. 
And at least two other reports (Rasmussen, Marsh, &
Brill, 1957; Simson, 1958) had appeared by the mid 
1960’s relating the incidence of infectious and other 
disease processes to instrumental performance require- 
ments involving escape and avoidance in laboratory 
animals. 


RECENT DEVELOPMENTS 


Over the past decade, the endocrine and cardio- 
vascular systems have continued to provide a major 
focus for laboratory behavioral studies within the 
framework of traditional concurrent models for the 
production of altered physiological states. Some atten-
tion has also been directed to gastro-intestinal
processes, central nervous system effects, and related 
physiological states. Significant methodological ad- 
vances related to the continuous measurement and 
recording of blood pressure (Perez-Cruet, Plumlee, & 
Newton, 1966; Swinnen, 1968; Werdegar, Johnson, & 
Mason, 1964) have contributed in an important way 
to the progressive development of better models. In-
creasing use of long-term preparations (Brady, 1965;
Findley, Brady, Robinson, & Gilliam, 1971; Forsyth, 
1969; Herd, Morse, Kelleher, & Jones, 1969) has pro- 
vided for more extended observation and experi- 
mental manipulation. Behavioral procedures of estab- 
lished effectiveness, including conditioned suppression 
(Estes & Skinner, 1941) and free-operant avoidance 
(Sidman, 1953) have received increasing attention in 
psychophysiology. Progressively more precise analysis 
of the observed interactions highlights the develop- 
ments to be reviewed in the remainder of this section. 


Endocrine and Cardiovascular Changes 


Systematic increases in plasma 17-hydroxycortico- 
steroid (17-OH-CS) levels during acquisition of con- 
ditioned suppression (i.e., “conditioned emotional re- 
sponse” or CER) have now been described, for 
example, in the rhesus monkey (Mason, Brady, & 
Tolson, 1966). Presentations of a 5-min tone were 
terminated contiguously with brief foot shock (5 ma 
for 0.2 sec); these were superimposed upon a lever- 
pressing performance maintained by food on an inter- 
mittent reinforcement schedule (VI 60”). Only one 
tone-shock pairing was programmed during each of 
seven separate experimental sessions. Each CER “ac- 
quisition” trial was programmed 15 min after the start 
of the animal’s daily lever-pressing session and blood 
samples were drawn immediately before and immedi-
ately after each 30-min session. Progressive decreases
in the lever-pressing rate during tone presentations
(i.e., conditioned suppression) over the course of the 7
CER acquisition trials were accompanied by progres- 
sive increases in 17-OH-CS levels (as compared to 
lever-pressing control sessions with no tone-shock 
pairings). Essentially the same CER procedure has 
also been studied in chronically catheterized rhesus 
monkeys monitored for heart rate and blood pressure 
changes during both acquisition and long-term main- 
tenance of conditioned suppression using a 3-min 
clicker presentation as the CS (Brady, Kelly, & Plum- 
lee, 1969). Over the course of the first 8 to 10 clicker- 
shock pairings, all 5 animals in the experiment 
showed consistent and systematic decreases in both 
heart rate and blood pressure during the clicker. How- 
ever, continued daily pairings of clicker and shock, 
superimposed upon the lever-pressing performance, 
produced abrupt and sustained reversals in both the 
direction and the magnitude of the cardiovascular re- 
sponse, usually commencing at about trial 9 or 10 as 
shown in Figure 1. Significantly, these changes were 
observed to persist as large magnitude increases in 
heart rate and both systolic and diastolic blood pres- 
sure in response to the behaviorally suppressing clicker 
presentations for from 50 to 100 daily conditioning 
trials. 
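
The progressive suppression of lever pressing described above is often summarized with a suppression ratio. The sketch below is illustrative only: the ratio B/(A + B), with A the responses in a pre-CS period and B the responses during an equally long CS period, is a common convention in the conditioned-suppression literature and is not stated to be the measure used in the studies cited here. The function name and all response counts are hypothetical.

```python
def suppression_ratio(pre_cs_responses: int, cs_responses: int) -> float:
    """Return B / (A + B): 0.5 means no change, 0.0 means complete suppression."""
    total = pre_cs_responses + cs_responses
    if total == 0:
        raise ValueError("no responses in either period; ratio is undefined")
    return cs_responses / total


# Hypothetical lever-press counts (pre-CS, during CS) for 7 acquisition trials.
trials = [(30, 29), (31, 24), (29, 18), (30, 12), (32, 8), (30, 3), (31, 0)]
ratios = [round(suppression_ratio(a, b), 2) for a, b in trials]

for trial_number, ratio in enumerate(ratios, start=1):
    print(f"trial {trial_number}: suppression ratio = {ratio}")
```

A ratio that falls toward zero across trials would correspond to the progressive decrease in response rate during the CS; it says nothing by itself about the accompanying hormonal or cardiovascular measures.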

The results of these more extended studies empha- 
size the differential temporal course of the skeletal and 
autonomic components of the CER development. The 
cardiovascular changes (i.e., heart rate and blood pres- 
sure decreases) accompanying the initial conditioning 
trials probably reflected the reduction in motor activ-
ity (i.e., suppression of lever-pressing rate) which
developed in response to the clicker over the first 8 to 
10 trials. The later-appearing “conditioned cardiac
respondent” could be considered to have developed 
only after this initial “suppression” stage and to have 
been maintained in the form of sustained cardiovas- 
cular activation (i.e., increased heart rate and blood 
pressure) during clicker presentations over the ex- 
tended course of the experiment. Such an account 
could, to some extent at least, reconcile recurrent con- 
flicting reports regarding the direction of heart rate 
changes in response to the CS in CER studies (De-
Toledo, 1971; DeToledo & Black, 1966; DeVietti &
Porter, 1970; Nathan & Smith, 1968; Parrish, 1967;
Smith & Nathan, 1967; Smith & Stebbins, 1965; Steb-
bins & Smith, 1964; Sutterer, Howard, Loth, & Obrist,
1970; Williams, 1969). In addition to the obvious 
problems associated with species differences (i.e., 
studies with monkey subjects generally report heart 
rate increases while rat studies usually describe heart 
rate decreases during conditioned suppression), these 
more recent findings, involving biphasic cardiovas-
cular changes in the course of CER development and
maintenance, strongly suggest that the temporal
course of such experimental observations may be an
important source of variability. More importantly, the
results of these studies would seem to support the
view that significant aspects of the behavioral and
autonomic effects described are not causally depen-
dent. Rather, it would appear that the cardiovascular
and skeletal changes are more accurately represented
as independently conditioned effects of the same ex-
perimental procedures. This characterization of emo-
tional conditioning (i.e., CER) would seem to have
important implications for the validity of theoretical
formulations which emphasize either the causal inter-
dependence of behavioral and physiological events in
the “emotion” process or the primacy of either one
(Brady, 1975).

Fig. 1. Minute-by-minute changes in blood pressure, heart rate,
and lever-pressing response rate for Monkey A on successive
three-min clicker-shock trials during acquisition of the condi-
tioned emotional response. The zero points represent control
values calculated from the three-min interval immediately pre-
ceding the clicker. (From Brady et al., 1969.)

The extended analysis of behaviorally induced en- 
docrine and cardiovascular changes over the past 
decade has focused even more intensively upon condi-
tioned avoidance models, predominantly of the free-
operant or Sidman variety using well-standardized 20-
sec RS and 2-sec SS intervals. There have been several
reconfirmations (Brady, 1965, 1967) of the two-fold to 
four-fold elevations in 17-OH-CS levels associated with 
even relatively brief, shock-free experimental ex- 
posures to such avoidance contingencies. Furthermore, 
marked differences in the hormone response were ob-
served (Mason, Brady, & Tolson, 1966) when the free-
operant avoidance procedure included a discriminable
exteroceptive warning signal or when “free” shocks
were superimposed upon the performance baseline. 
Significantly, the corticosteroid response was consis- 
tently reduced during “discriminated” avoidance ses- 
sions including an exteroceptive auditory stimulus 
presented 5 sec before shock whenever 15 sec had 
elapsed since a previous response, though removal of 
the “warning signal” resulted in the immediate reap-
pearance of the steroid elevations. Conversely, super-
imposing “free” or unavoidable shocks (at the rate of
approximately 2 or 3 per hour) upon a well-estab- 
lished avoidance performance without a “warning 
signal” was observed to produce more than a 100% 
increase over the elevated corticosteroid levels evident 
during the regular nondiscriminated Sidman avoid- 
ance procedure. 
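
The free-operant (Sidman) contingency with a 20-sec response-shock (RS) interval and a 2-sec shock-shock (SS) interval can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the programming equipment used in the studies cited: the function name is hypothetical, the SS timer is assumed to start at session onset, and shocks are treated as instantaneous events.

```python
def sidman_shock_times(response_times, session_s, rs=20.0, ss=2.0):
    """Return the times (sec) at which shocks occur under free-operant avoidance.

    Each response postpones the next shock until `rs` sec after that response;
    in the absence of responding, shocks recur every `ss` sec.
    """
    shocks = []
    next_shock = ss  # assumption: the shock-shock timer starts at session onset
    pending = sorted(t for t in response_times if 0 <= t < session_s)
    i = 0
    while True:
        if i < len(pending) and pending[i] < next_shock:
            # A response before the scheduled shock resets the RS timer.
            next_shock = pending[i] + rs
            i += 1
        elif next_shock < session_s:
            # No response arrived in time: deliver the shock, start the SS timer.
            shocks.append(next_shock)
            next_shock += ss
        else:
            break
    return shocks


# Responding every 15 sec keeps every RS interval from elapsing: no shocks.
print(sidman_shock_times([0, 15, 30, 45], session_s=60))
# A single response at time 0 is followed by shocks from 20 sec onward.
print(sidman_shock_times([0], session_s=30))
```

Under the "discriminated" variant described above, a warning stimulus would come on 15 sec after the last response, that is, 5 sec before the scheduled shock; that extension is omitted here to keep the sketch short.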

Significant advances have also been made during 
the past decade in determining the endocrine and 
cardiovascular consequences of free-operant avoidance 
performance requirements during long-term studies of 
months and even years. A report by Brady (1965), for
example, describes the effects of repeated exposure to
continuous 72-hour avoidance over periods up to and, 
in some cases, exceeding one year, upon patterns of 
thyroid, gonadal, and adrenal hormone secretion in a 
series of five chair-restrained rhesus monkeys. Two of 
the five monkeys participated in the 72-hour avoid- 
ance experiment on six separate occasions over a 6- 
month period with an interval of approximately 4 
weeks intervening between each exposure. The re- 
maining three animals performed on a schedule which 
repeatedly programmed 72-hour avoidance cycles fol- 
lowed by 96-hour non-avoidance or “rest” cycles (3 
days “on” and 4 days “off”) for periods up to and ex-
ceeding one year.

The two animals exposed to repeated 72-hour avoid- 
ance at monthly intervals for 6 months showed a pro- 
gressively increasing lever-pressing response rate with 
each of the six successive 72-hour avoidance sessions, as
illustrated in Figure 2. During the initial 72-hour
avoidance experiment with these two animals, response 
rates averaged 16 and 18 per minute, respectively. Re- 
sponse-rate values for these same monkeys during the 
sixth 72-hour avoidance experiment averaged 28 and 
27 resp/min, respectively. In contrast, shock frequen- 
cies over this same period showed a sharp decline 
within the first two 72-hour avoidance cycles. Hor- 
mone changes related to the repeated 72-hour avoid- 
ance cycles showed consistent and replicable patterns 
over the 6-month experimental period for both 
animals. During the initial experimental sessions, as 
shown in Figure 2, both monkeys showed approx- 
imately three-fold elevations in 17-OH-CS levels dur- 
ing 72-hour avoidance and returned to near baseline 
levels about 6 days afterwards. The remaining four
monthly experiments were characterized by substan- 
tial, though diminished steroid responses (approxi- 
mately two-fold elevations in 17-OH-CS levels) during 
avoidance, with essentially the same 6-day period re- 
quired for recovery of basal levels. Significant changes 
related to the extended avoidance performance were 
also observed in catecholamine, gonadal, and thyroid
hormone levels, with recovery cycles extending in
some instances (thyroid) for 3 weeks following the 72- 
hour avoidance period. A detailed experimental and 
interpretive analysis of such multiple hormone 
changes induced by exposure to the 72-hour Sidman 
procedure has been provided in an exhaustive multi- 
authored monograph (Mason et al., 1968) describing 
this most systematic series of laboratory studies yet to 
appear in the psychoendocrine literature. 

Fig. 2. Steroid levels, avoidance response rates, and shock fre-
quencies for animals M-736 and M-77 during 6 monthly 72-hour
avoidance sessions. (From Brady,

The three remaining monkeys described in the
Brady (1965) report as performing on the 3 days “on,”
4 days “off” avoidance schedule showed an initial in-
crease in lever-pressing response rates for approxi-
mately the first 10 avoidance sessions similar to that
seen with the two animals described above. By ap-
proximately the 20th weekly session with these
animals, however, lever-pressing response rates during
the 72-hour avoidance period had decreased to a value
well below that observed during the initial avoidance
sessions, and the performance tended to stabilize at
this new level for the ensuing weeks of the experi-
ment. In contrast, shock frequencies for all animals
quickly approximated a stable low level within the
first two or three exposures to the avoidance schedule
and seldom exceeded a rate of 2 shocks per hour for
the remainder of the experiment. Food and water in- 
take, however, remained relatively stable throughout 
the entire course of the study. The typical pattern,
exemplified by monkey M-157, is illustrated in Figure
3. The initial 72-hour avoidance sessions were charac-
terized by progressive increases in lever-pressing and
elevations in 17-OH-CS levels. In the succeeding 
weeks, 17-OH-CS levels gradually declined but rose
again by the 30th week. The general pattern obtained 
with M-157 was replicated with only minor variations 
in the two additional animals on this same experi- 
mental program. Perhaps the most consistent and 
striking observation in all three monkeys was the 
change in responsivity of the pituitary-adrenal system 
to the avoidance stress with continued exposure to 
this procedure over extended time periods. These
findings are somewhat at variance with the repeated 
observations in many previous acute studies of a close 
positive relationship between steroid elevations and
avoidance performance. These more extended studies, 
in contrast, suggest that continued exposure to re- 
peated performance requirements on the time sched-
ule programmed in this experiment may produce a
dissociation between the avoidance performance and 
the 17-OH-CS response. Although a definitive analysis 
of such relationships is not possible on the basis of 
these data alone, a critical role of the temporal 
parameters (work-rest cycles) is clearly indicated. Cer- 
tainly, related findings (Mason et al., 1968) on the 
course of recovery for a broad range of hormone 
measures provide additional support for this focus 
upon temporal factors in the experimental analysis of 
behaviorally induced physiological states. 

Fig. 3. Steroid levels, avoidance response rates, shock frequen-
cies, and food and water intake levels for animal M-157 through-
out 65 weekly 72-hour avoidance sessions. (From Brady, 1965.)

A trend toward more extended periods of experi-
mental observation and measurement has also been
apparent in concurrent avoidance studies focusing
upon cardiovascular changes, particularly in primates.
Both rhesus (Forsyth, 1969) and squirrel monkeys
(Herd, Morse, Kelleher, & Jones, 1969) have been re-
ported to develop hypertensive blood pressure levels
with recurrent exposure to free-operant avoidance
requirements for periods up to and exceeding 12
months. Chair-restrained baboons (Findley, Robinson,
& Gilliam, 1971) performing on a discrete-trial fixed
ratio instrumental escape-avoidance procedure, how-
ever, were not found to maintain elevated blood pres-
sure levels over the year or more during which they 
participated in the study (Findley, Brady, Robinson, 
& Gilliam, 1971). During two six-hour periods each 
day (separated by rest, feeding, and sleep intervals), 
the animals were required to respond on an FR 100 
schedule to terminate a red light presented intermit- 
tently (average interval 5 min) and associated with 
occasional unavoidable shocks. Indeed, the baboons in 
this extended study did show substantial pressure in- 
creases during the actual escape-avoidance perfor- 
mance intervals within the daily experimental sessions, 
and there were some periods during the first several 
months on the program characterized by general eleva- 
tions in both blood pressure and heart rate as illu- 
strated in Figure 4. But a significant differentiating 
feature of the schedule requirements in these latter 
studies generated substantial ratio performances on a 
rather heavy Lindsley manipulandum and the per- 
sistent cardiac output (i.e., heart rate) elevations at- 
tendant upon the recurrent 24-hour exposure to this 
high “work-activity” level do not appear to have been 
a prominent feature of the studies by Forsyth (1969) 
and Herd et al. (1969) involving more conventional
(i.e., non-fixed ratio) free-operant avoidance proce-
dures. Indeed, this characteristic of the behavioral-
cardiovascular interaction pattern may well have 
played a critical role in the long-term return to 
normotensive pressure levels illustrated in Figure 4 
for the baboons studied by Findley, Brady, Robinson,
& Gilliam (1971). 

Fig. 4. Mean systolic blood pressure (top three lines in up-
per section), mean diastolic blood pressure (bottom three
lines in upper section), and mean heart rate (bottom sec-
tion) for baboon Sport plotted at approximately weekly inter-
vals during the course of the experiment for each of the
three major activity cycles. (From Brady et al., 1971.)

A recent series of studies at the Johns Hopkins
University School of Medicine on cardiovascular
changes associated with operant avoidance procedures
(Anderson & Brady, 1971, 1972, 1973a, 1973b; Ander-
son, Daley, Findley, & Brady, 1970; Anderson &
Tosheff, 1973) provides further evidence which is at
least consistent with the relationship between muscle
activity and the dynamic interplay of cardiac output
and peripheral resistance suggested by Findley et al.
(1971). The focus of Anderson’s studies with dogs has
been upon continuous monitoring of blood pressure
and heart rate during free-operant (panel press) shock
avoidance, and, significantly, during a pre-avoidance
period of fixed duration systematically programmed to
precede the required avoidance performance. Under
these conditions, a unique divergence between heart
rate and blood pressure changes was observed during
pre-avoidance intervals up to 15 hours in length, with
virtually all animals showing a characteristic systolic
and diastolic pressure increase and either a decrease
or no change in heart rate. Comparisons involving
similar performance requirements on a variable inter-
val food reinforcement schedule revealed a markedly
different pre-performance cardiovascular pattern char-
acterized by systematic increases in both heart rate
and blood pressure. And this differential “prepar-
atory” pattern has now been confirmed both between
individual animals maintained separately on each of
the procedures, and “within” the same animal alter- 
nately performing on the avoidance and food rein- 
forcement schedule, as shown in Figure 5. The bar 
graph to the left shows the average blood pressure, 
heart rate, and panel response rate for dog Simon 
during consecutive 10-min intervals for 10 “avoid- 
ance” sessions and illustrates the divergent change in 
heart rate and blood pressure which occurs during the 
one-hour pre-avoidance period. The middle graph 
shows the same measures for the same dog taken dur- 
ing 10 subsequent “food” sessions illustrating the 
characteristic concordance between heart rate and 
blood pressure increases in the course of the preper- 
formance hour. And finally, the bar graph to the right 
shows Simon’s recovery of the pre-avoidance pattern 
during a single “avoidance” session following ex- 
posure to the 10 “food” sessions shown in the middle 
graph. 

Direct measurements of cardiac output in such dogs 
prepared with aortic flow probes during exposure to 
the avoidance performance have indicated that the 
pre-avoidance pressure changes are attributable to in- 
creased peripheral resistance while the pressure in- 
creases during the avoidance performance per se occur 
under conditions of increased cardiac output and de- 
creased peripheral resistance. Additional beta adren- 
ergic blockade studies with the drug propranolol dur-
ing the same experimental procedure confirm the role
of sympathetic arousal in the sustained pressure eleva- 
tions during avoidance but suggest that factors other 
than sympathetic mediation may be involved in the 
progressive pre-avoidance pressure increases (Ander- 
son & Brady, 1973b). 

The results of these experiments establish firm re- 
lationships between a broad range of endocrine and 
cardiovascular response processes and free-operant be- 
havioral performances. Both general and specific sup- 
port for these findings has now been provided by 
numerous published reports with rodents, carnivores, 
and primates (Banks, Miller, & Ogawa, 1966; Black,
1959; Black & Dalton, 1965; Brady, 1967, 1969, 1970a,
1970b, 1971, 1972, 1974; Brady, Anderson, Harris, &
Stephens, 1973; Brady, Findley, & Harris, 1971; Brady,
Harris, & Anderson, 1972; Brady & Nauta, 1972;
Brown, Schalch, & Reichlin, 1971; Brush & Levine,
1966; Coover, Goldman, & Levine, 1971; Forsyth,
1968, 1971, 1972; Forsyth & Harris, 1970; Forsyth,
Hoffbrand, & Melmon, 1971; Frazier, Weil-Malherbe,
& Lipscomb, 1969; Granger, 1970; Graham, Cohen, &
Shmavonian, 1967; Higgins, 1971; Hokanson, De-
Good, Forrest, & Brittain, 1971; Jennings, Averill,
Opton, & Lazarus, 1970; Jolley, 1970; Kelleher, Morse,
& Herd, 1972; Krahenbuhl, 1971; Laforge, 1971; Law-
ler, Meyers, & Obrist, 1972; Levine, Gordon, Peterson, 
& Rose, 1970; Malcuit, Ducharme, & Berlanger, 1968; 


Fig. 5. Average blood pressure, heart rate, and panel response rate during consecutive
10-min pre-performance and performance intervals for 10 “avoidance” sessions (left
panel), 10 “food” sessions (middle panel), and one additional “avoidance” session (right
panel) following exposure to the 10 “food” sessions (middle panel) with the same
dog (Simon). (From Brady et al., 1973.)

Miller, Banks, & Caul, 1967; Miyata & Soltysik, 1968;
Morse, Herd, Kelleher, & Gross, 1971; Rose, Mason, & 
Brady, 1969; Soltysik & Kowalski, 1960; Stern & Word, 
1962; Stoyva, Forsyth, & Kamiya, 1968; Swadlow, 
Hosking, & Schneiderman, 1971; Vanderwolf & Van- 
derwart, 1970; Weiss, Stone, & Harrel, 1970). In many 
respects, the changes in absolute levels of selected 
hormones and autonomic activity can be viewed as 
reflecting relatively undifferentiated consequences of 
arousal states associated with behavioral responses 
under aversive conditions. The reliable temporal 
course of visceral and steroid changes under such 
conditions and the quantitative relations between de- 
gree of behavioral involvement and at least short-term 
physiological response levels have been well docu-
mented. In addition, the organism’s behavioral history
would appear to be critical in determining the nature 
and extent of autonomic-endocrine response to such 
performance situations. Clearly, however, the most
meaningful level of analysis for such hormone and 
visceral response processes in relationship to more 
chronic emotional interactions would appear to be the 
broader patterning or balance of secretory and visceral 
change in many interdependent autonomic and en- 
docrine systems which in concert regulate metabolic 
events. Indeed, it would appear that such autonomic- 
endocrine response patterns can be usefully differen- 
tiated in relationship to the historical and situational 
aspects of behavioral events. This differential analysis 
may well provide a first step in the direction of iden- 
tifying distinguishable intraorganismic consequences 
which are associated with both episodic and persistent 
behavioral interactions. 


Gastrointestinal Effects 


While the obvious constraints imposed upon lab- 
oratory studies of behaviorally induced somatic path- 
ology have continued to limit the range of experi- 
mental activity in this general area over the past 
decade, the recent research literature does reflect an 
abiding concern with the effects of environmental 
interactions upon the gastrointestinal system (Ader, 
1971; Smith & Hain, 1970). Of particular interest 
would seem to be the rather extended analysis of fac- 
tors which influence the incidence of peptic ulcers in 
rodents and primates under aversive behavioral con- 
trol. Some further support for the efficacy of “conflict” 
and related procedures in the production of gastric 
lesions in laboratory rats has been provided by studies 
focusing upon approach-avoidance methods and com- 
parisons involving individual and group “stress” ex- 
posure (Lower, 1967; Sawrey, 1964; Sawrey & Long,
1962; Sawrey & Sawrey, 1963, 1964a, 1964b, 1966), but
replication and confirmation of the reported relation- 
ships continue to present problems (Ader, Beels, & 
Tatum, 1960; Ader, Tatum, & Beels, 1960; Pare, 
1964). Similarly, recurrent descriptions of avoidance 
performance effects upon the gastrointestinal system 
have characteristically presented something less than a 
consistent picture with regard specifically to the con- 
ditions under which ulcers are most likely to occur. 
The reported incidence of peptic ulcers in rhesus 
monkeys intermittently exposed to free-operant shock- 
avoidance requirements (Brady, Porter, Conrad, & 
Mason, 1958) has proven difficult to repeat under 
some laboratory conditions (Folz & Miller, 1964) in- 
cluding those under which the study originated 
(Brady, 1964). Additionally, several investigations 
with laboratory rats in escape-avoidance situations 
have failed to find an incidence of gastric lesions in 
experimental animals which exceeded that of controls, 
and in some instances, yoked control animals receiv- 
ing unavoidable shocks showed a greater degree of 
ulceration than their avoiding partners (Moot, 
Cebulla, & Crabtree, 1970; Pare, 1971; Weiss, 1971a).

To some extent, a clarification and at least partial 
reconciliation of these apparently conflicting develop- 
ments in the delineation of behavioral effects upon 
the gastrointestinal system has been suggested by 
Weiss in a systematic series of published experimental 
reports from the Rockefeller University in New York 
(Weiss, 1970, 1971a, 1971b, 1971c). These studies 
started with the observation that laboratory rats which 
received intermittent tail shock following presentation 
of a 10-second beeping tone developed significantly 
less gastric ulceration than animals receiving the same 
shock without the “predictability” provided by the 
pre-aversive “warning” stimulus. Weiss then examined
the effects of adding an operant escape-avoidance
(“coping”) panel-press to the procedure. Under these
conditions, markedly fewer gastric lesions were found
in the experimental animals when compared with
“helpless” controls similarly exposed to warning sig-
nals and shocks (1 per min for 21 hours) but without
escape-avoidance “coping.” The interactions between
warning signals and the escape-avoidance responses 
were tested in a subsequent experiment in which rats 
received electric shock that was preceded by either a 
warning signal, a series of signals providing an “ex- 
ternal clock,” or no signal at all. Under all three con- 
ditions, animals which could avoid and/or escape 
shock developed less ulceration than did yoked “help- 
less” animals. In addition, there was a clear difference 
in favor of the warning signal condition reducing ul- 
ceration as compared to the no-signal controls regard-
less of whether they were “helpless” or could escape/
avoid the shock.

On the basis of this rather mammoth 180-rat ex- 
periment, Weiss theorized that the incidence of peptic 
ulcers may be a function of the interaction between 
strength of the escape-avoidance performance (i.e.,
the frequency of “coping” responses) and the prob-
ability of discriminable response-contingent signals
associated with the absence of aversive stimuli (i.e.,
“feedback” about shock-free conditions). In these 
terms, the incidence of peptic ulcers in monkeys per- 
forming on free-operant avoidance is accounted for by 
the fact that they respond frequently in the absence 
of warning stimuli, and that the “safe” signals pro- 
duced by these responses are not readily discriminable 
as “feedback” stimuli. The yoked-control monkeys, in
contrast, characteristically emitted ‘avoidance’ re- 
sponses only infrequently, received only a few shocks 
well-distributed in time (due to the high performance 
rates of the experimental animals), and were found to 
be free of gastrointestinal pathology. Some further 
confirmation of this formulation has been provided by 
Weiss in a subsequent series of experiments which 
showed that the frequency of ulcers was increased in
avoidance rats punished with shock for responding 
(i.e., “coping” response in high strength plus weak 
“feedback” about shock-free conditions), and decreased 
in animals producing a brief tone with each shock- 
postponing panel-press (i.e., strong “feedback” about 
shock-free conditions).
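
The qualitative form of this formulation can be made concrete with a toy calculation. The sketch below is only illustrative: the multiplicative form, the function name `relative_ulcer_risk`, and the numerical values are assumptions introduced here, not quantities reported by Weiss. It simply encodes the two stated tendencies, namely that ulceration should rise with the frequency of “coping” responses and fall as the probability of discriminable “feedback” about shock-free conditions rises.

```python
def relative_ulcer_risk(coping_rate: float, feedback_probability: float) -> float:
    """Toy index of ulcer risk (arbitrary units).

    Assumption (not from the source): risk is the product of coping-response
    frequency and the proportion of responses that produce NO discriminable
    "safe" feedback. Only the direction of the two effects is taken from
    the text; the functional form is hypothetical.
    """
    if coping_rate < 0:
        raise ValueError("coping_rate must be non-negative")
    if not 0.0 <= feedback_probability <= 1.0:
        raise ValueError("feedback_probability must lie in [0, 1]")
    return coping_rate * (1.0 - feedback_probability)


# Hypothetical conditions, values chosen only to show the ordering.
conditions = {
    "avoidance + tone feedback": relative_ulcer_risk(10.0, 0.9),
    "avoidance, weak feedback": relative_ulcer_risk(10.0, 0.1),
    "yoked, infrequent responding": relative_ulcer_risk(1.0, 0.1),
}
for name, risk in conditions.items():
    print(f"{name}: {risk:.1f}")
```

Under these assumed values the index is highest for frequent responding with weak feedback and lower both when each response produces clear feedback and when responding is infrequent, which is the ordering the formulation describes.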

It is perhaps worth noting (though a bit out of 
place in the “physiological systems” organization of
this chapter) that Weiss and his colleagues at the
Rockefeller University have also found hormonal and
body weight changes which reflect the interactions
suggested by the ulcer studies, and that levels of brain
norepinephrine are increased in escape-avoidance 
animals and decreased in non-performing (“helpless’’) 
shocked rats (Weiss, 1972; Weiss, Stone, & Harrel, 
1970). 


Central Nervous System Interactions 


Electrophysiological changes continue to provide a
hazy focus for a handful of studies over the past 
decade involving the use of operant methodology in 
the analysis of behaviorally-induced central nervous 
system alterations. Sidman avoidance performance in 
the rat, for example, has been reported (Bremner, 
1964) to change irregular hippocampal EEG activity 
such that regular 5-7 cps theta activity appears just 
prior to and during lever pressing. Significantly, how-
ever, alteration of this theta pattern by direct elec-
trical stimulation did not disrupt the concurrent
avoidance behavior, indicating clearly that the corre- 
lated hippocampal theta activity was not essential to 
the performance. Food-maintained instrumental pedal- 
pushing in the dog has also been found to produce an 
increase in hippocampal theta rhythms accompanied 
by a highly correlated acceleration in heart rate 
(Konorski, Santibanez-H, & Beck, 1968). Large ampli- 
tude spindle electrocorticogram activity topograph- 
ically restricted to the parieto-occipital region has as 
well been reported in the cat during lever pressing on 
a variable interval schedule for milk reward (Marczyn- 
ski, Rosen, & Hackett, 1968), and systematic changes in 
the cortical evoked activity (auditory and visual cor- 
tex) of the cat have been related to the rate of dis- 
criminative avoidance acquisition (Saunders, 1971). 
Several studies involving depth recording of both 
single and multiple neural unit activity during oper- 
ant performances have been reported in the brain 
physiology literature over the past several years. Single 
neurons in the midbrain tegmentum of rats have been 
shown to respond discriminatively (i.e., increased fir-
ing rates) to tones following a lever press, signaling
food, water, or no reinforcement under differential
deprivation conditions (Phillips & Olds, 1969). EEG
potential and amplitude changes in the reticular
formation and amygdala of Wistar rats have also been
shown to differ as a function of food reward and non-
reward during a discriminated lever pressing perfor-
mance using auditory signals (Norton, 1970). Multiple
unit activity recorded from implanted monopolar
macroelectrodes in the reticular formation, thalamus,
cochlear nucleus, inferior colliculus, medial genic-
ulate, and auditory cortex of the cat during discrim-
inative instrumental avoidance training with tone as
a warning signal has also been shown to reflect the
sequential development of neuronal conditioning 
(Halas, Beardsley, & Sandlie, 1970). The process was 
observed to start with the reticular formation and 
progress upward from the cochlear nucleus to the 
auditory cortex. While one would be hard put to sup- 
port an interpretation of these findings in terms of 
direct operant control of such neurophysiological 
events, some evidence has been produced for dis- 
tinguishable neural activity patterns during classical, 
instrumental, and discrimination learning using the 
same multiple unit model (Beardsley, 1969). In this 
regard, for example, a recent study of EEG discrimi- 
nators of delayed matching to sample behavior in 
Macaca nemestrina has shown that coherence values 
associated with correct responses were generally higher 
than those for incorrect responses in the frequency 
bands 1-3, 3-4, 5-7, and 8-13 Hz (Campeau, Adey, Dur-
ham, Tolliver, Ringler, & Kanner, 1971). The authors
suggest that such elevated electrophysiological coher- 
ences may be a condition for optimal performance 
under these operant circumstances. 

The relationship between electrical signs of “expec-
tancy” in the brain as reflected in the EEG contingent
negative variation (CNV) wave form (Walter, 1966) 
and operant performance requirements has received 
increasing experimental attention in several recently 
reported studies (Delse, Marsh, & Thompson, 1972; 
Donchin, Gerbrandt, Leifer, & Tucker, 1972; Peters, 
Knott, Miller, VanVeen, & Cohen, 1970; Rebert, 1972; 
Tecce, 1972). Without belaboring the details of the 
several stimulus and response variable analyses (e.g., 
task difficulty, pitch discrimination accuracy, reaction 
time) which have been described, the general conclu- 
sion that a widespread and protracted negative poten- 
tial in the frontal cortex can be associated with dis- 
criminative stimulus control of minimal instrumental 
response tendencies (Hefferline, 1958) seems now to 
be well established both clinically and experimentally. 
Quantitative relations between the CNV measure as a 
predictor and the properties of the operant response 
(i.e., accuracy, latency, etc.), however, remain to be
worked out in more precise detail. The range and 
variability of such performance-related brain electrical 
activity changes must also be extended to include the 
“contingent positive variation” wave form (large 180- 
300 microvolt positive steady potential shift associated 
with high-voltage alpha activity over the posterior 
marginal gyrus) reported to be associated with re- 
sponse-produced food reinforcement in deprived cats 
(Marczynski, York, & Hackett, 1969). 

A few reports over the past several years have sug- 
gested that behaviorally induced changes in the chem- 
ical constitution of the brain may be systematically 
related to operant performance requirements under at 
least some aversive control conditions. The work of 
Hyden as reviewed recently by Deguchi (1969) in 
relationship to biochemical research on learning gen-
erally supports the finding that RNA in Deiters’
nucleus of the rat is both increased and changed in 
composition as a function of wire-climbing, and that 
similar changes are observed in cortical cells when 
transfers in “handedness” training are required. In- 
strumental escape and avoidance training have also 
been reported to increase the incorporation of uridine 
into polyribosomes of the mouse brain and produce 
higher poly- and monosome ratios as compared to 
both yoked (receiving shocks only) and non-yoked con- 
trol mice (Uphouse, MacInnes, & Schlesinger, 1972a, 
1972b). It also seems possible that the shock-avoid-
ance-induced hyperthermia in instrumentally trained
rats recently described by Delini-Stula (1970) as en-
during over an extended series of extinction trials may 
be related directly or indirectly to such changes in 
brain chemistry. 


CONTINGENT MODELS 


Technological and Methodological Developments 


Research over the past decade concerned with the 
effects of conditioning procedures on physiological re- 
sponses has focused prominently upon contingency 
relationships between antecedent visceral and glan- 
dular changes on the one hand, and experimentally 
programmed environmental consequences (e.g., food 
delivery and/or shock avoidance), on the other. Stud- 
ies within the framework of this instrumental condi- 
tioning paradigm have clearly emphasized some 
physiological response systems (e.g., cardiovascular) 
more than others. This uneven distribution of 
measures reflects the technological and methodolog- 
ical developments which have paced the emergence of 
an operational laboratory psychophysiology. It is ap- 
propriate to acknowledge at least some of the major 
innovations and refinements upon which this bur-
geoning research domain depends. Particularly note-
worthy have been the technical advances in the re- 
cording and measurement of heart rate (Blizard & 
Welty, 1970; Brener, 1965; DeToledo & Black, 1965; 
Ferraro, Silver, & Snapper, 1965; Fitzgerald, Var- 
daris, & Teyler, 1968; Krausman, 1970; Pare, Isom, & 
Reus, 1970; Perez-Cruet, Tolliver, Dunn, Marvin, & 
Brady, 1963; Ramsay, Pomerleau, & Snapper, 1968), 
blood pressure (DiCara, Pappas, & Pointer, 1969; For- 
syth & Rosenblum, 1964; Herd & Barger, 1964; Kraus- 
man, 1969; Krausman, Ehrlich, & Brady, 1972; Perez-
Cruet, Plumlee, & Newton, 1966; Swinnen, 1968;
Werdegar, Johnson, & Mason, 1964), hormone levels 
(Mason et al., 1968), and electromyographic activity 
(Dixon, DeToledo, & Black, 1969; Hefferline, Keenan, 
Hartford, & Birch, 1960) in both restrained and free- 
moving animals. As these and other psychophysiolog- 
ical developments (Brown, 1967) have increased the 
ease and accessibility of visceral and autonomic 
measurement techniques, an ever-broadening range of 
biological events has been exposed to experimental 
scrutiny in relationship to behavioral conditioning 
procedures. 


Early History 


Studies concerned with the analysis of instrumental 
visceral-autonomic conditioning represent a relatively 
recent development in the experimental production of
altered physiological states with laboratory roots orig-
inating in the work of Neal Miller and his students
at Yale in the mid-1960s (e.g., Miller, 1969). Indeed,
several earlier reports with human subjects (Crider, 
Shapiro, & Tursky, 1966; Fowler & Kimmel, 1962; 
Johnson, 1963; Kimmel & Hill, 1960; Kimmel & Kim- 
mel, 1963; Lisina, 1958; Razran, 1961; Shapiro, 
Crider, & Tursky, 1964; Shearn, 1962) had foretold of 
such “operant” learning effects involving visceral 
and autonomic processes, and an extensive literature 
on “voluntary” physiological control by Yoga medita- 
tion and breathing techniques (Anand & Chhina, 
1961; Anand, Chhina, & Singh, 1961; Bagchi & Wen-
ger, 1959; Wenger & Bagchi, 1961; Wenger, Bagchi, &
Anand, 1961) has long been available. But the experi- 
mental analysis of such instrumental autonomic condi-
tioning effects in the animal laboratory has clearly
activated a new research area in the investigation and 
application of such “visceral learning” phenomena. 


Recent Past Reports 


The earliest reported animal learning experiments 
on instrumental autonomic conditioning involved an 
attempt by Miller and Carmona (1967) to change the 
rate of salivation in a water-deprived dog by reinforc-
ing both increases and decreases in this antecedent
autonomic response with a contingent environmental 
consequence (i.e., water reward). Although the results 
of this study showed clearly that such autonomic re- 
sponses could be controlled by operant procedures, 
attention was focused upon the possible role of 
skeletal muscle activity as a “mediator” of the ob- 
served visceral changes. Since the curarization tech-
nique used to control such skeletal muscle mediation
produced direct effects upon salivation, an experiment 
by Trowill (1967) explored the operant control of 
heart rate in curarized laboratory rats using rewarding
electrical brain stimulation (medial forebrain bundle 
at the level of the posterior hypothalamus) as a rein- 
forcing consequence. Although the actual changes 
were small, both increases and decreases in heart rate 
were successfully conditioned. A subsequent study by 
Miller and DiCara (1967) showed that the magnitude 
of the instrumentally conditioned heart rate response 
could be influenced dramatically (producing changes 
approximating 20% of the basal values) by a “shap-
ing” procedure which required the animals to meet a
progressively more difficult criterion in order to ob- 
tain the rewarding brain stimulation. In addition, this 
experiment also demonstrated that such operant 
autonomic changes could be brought under discrim-
inative control of the specific stimulus complex which
provided the occasion for reinforcement of the re- 
sponse either of raising or lowering the heart rate. In 
a further confirmation of this instrumental heart rate 
conditioning effect, Hothersall and Brener (1969), us- 
ing curarized rats and electrical brain stimulation re- 
ward, incorporated a feedback light whenever the 
prescribed criterion was met, and extended their in-
vestigation to include a demonstration of operant
extinction when the instrumentally conditioned heart 
rate response was no longer reinforced with brain 
stimulation. 
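
The logic of a “shaping” contingency of the kind described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reconstruction of the published procedure: the function name `run_shaping`, the fixed step size (2% of the basal rate), and the sample heart rates are all hypothetical. Reinforcement is delivered only when the measured rate meets the current criterion, and each reinforcement moves the criterion further from baseline in the trained direction.

```python
from typing import List, Tuple


def run_shaping(
    baseline: float,
    samples: List[float],
    step_fraction: float = 0.02,  # assumed step: 2% of baseline per success
    direction: int = 1,           # +1 trains increases, -1 trains decreases
) -> Tuple[List[bool], float]:
    """Apply a progressively stricter criterion to a series of heart rates.

    Returns a list marking which samples earned reinforcement and the
    criterion in force after the last sample.
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    criterion = baseline * (1 + direction * step_fraction)
    reinforced = []
    for rate in samples:
        # The response "meets criterion" when it lies at or beyond the
        # criterion in the trained direction.
        met = direction * (rate - criterion) >= 0
        reinforced.append(met)
        if met:
            # Make the requirement harder after every reinforced response.
            criterion += direction * step_fraction * baseline
    return reinforced, criterion


# Hypothetical beats-per-minute samples for a rat with a basal rate of 400.
flags, final_criterion = run_shaping(400.0, [400, 410, 410, 420, 430])
print(flags, final_criterion)
```

With these assumed values the criterion starts at 408 and is raised in steps of 8 after each reinforced sample, so that a rate which earned reinforcement early in the series no longer does so later.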


Mediational Events 


Persistent concern with the degree of skeletal in- 
volvement in such instrumental autonomic condition- 
ing was reflected in a series of experiments by Black 
(1967a) with dogs initially trained on a lever-pressing 
shock-avoidance task and subsequently curarized for 
operant conditioning of either electromyographic or 
heart rate responses. The level of curarization insured
little or no overt movement but did not completely
eliminate the EMG response. The results showed that
the instrumentally conditioned heart rate changes
were closely associated with the conditioned EMG
changes which readily transferred to affect the per-
formance in the non-curarized state. In a later report,
however, Black (1968) concluded that the heart rate
response could be conditioned independently of overt
movement, without conceding that the operant
autonomic changes occurred in the absence of some
central event related to the initiation and perfor-
mance of skeletal motor responses. Miller and DiCara
(1967) had in fact hypothesized that such central
activity (e.g., motor cortex impulses), classically con-
ditioned to elicit heart rate changes, might account
for the demonstrated autonomic effects. A control pro-
cedure, however, involving strong tail shocks (8 ma)
produced smaller heart rate increases (10%) than did
instrumental conditioning (20%), suggesting that such
indirect skeletal mediation was unlikely to account for
the observed heart rate changes. They further showed
(DiCara & Miller, 1968a) that the completely curarized
rat (i.e., no EMG responses from the gastrocnemius
muscle) could learn to both increase and decrease 
heart rate as an operant shock-avoidance response, 
thus establishing that the instrumental autonomic 
conditioning effect was not an artifact of the electrical 
brain stimulation reinforcer. The results of this study 
also showed that discriminative control over the con- 
ditioned heart rate change could be developed and 
maintained by a stimulus which always preceded 
shock presentation. Such instrumentally learned heart
rate changes have also been shown to persist in the 
absence of reinforced practice trials over extended 
periods (e.g., 3-month retention tests), and they can be 
relearned after extinction (DiCara & Miller, 1968d). 


Response Specificity 


Transfer of learning effects have also been demon- 
strated in a series of studies by DiCara and Miller 
(1969a, 1969b) in which heart rate changes instru- 
mentally conditioned under curare were subsequently 
(one week later) observed in free-moving rats, and 
additional training in the noncurarized state was 
shown to produce changes of even greater magnitude. 
Similar transfer was also demonstrated from the non- 
curarized to the curarized state, and the differences in 
respiration and gross movement were found to de- 
crease as the differences in heart rate increased. This 
emergent response specificity has also been docu-
mented in an experiment by Miller and Banuazizi
(1968) in which independent operant control of both 
heart rate changes and intestinal contractions was 
demonstrated. Fields (1970) subsequently demon- 
strated the remarkable specificity of such conditioning 
effects by producing instrumentally learned increases 
or decreases in the P-R interval of the EKG inde- 
pendently of changes in the P-P interval. The issues 
related to a linkage between somato-motor and cardio- 
vascular activities in the instrumental autonomic con- 
ditioning process have not been definitely settled, 
however. This is seen from a recent report by 
Goesling and Brener (1972) which showed that two 
different training procedures (i.e., immobility train- 
ing versus activity training), given prior to instru- 
mental heart rate conditioning under curare, can have 
a greater effect upon heart rate changes in subse- 
quently curarized rats than the reinforcement con- 
tingencies per se. Other types of training (e.g., lever 
pressing) given prior to the instrumental conditioning 
of heart rate in rats have appeared related to subse- 
quent performance in heart rate conditioning studies 
(Miller & DiCara, 1967), but negative findings (i.e., no
relationship between prior bar press training and sub- 
sequent heart rate conditioning) have also been re- 
ported (Slaughter, Hahn, & Rinaldi, 1970). 


Response Interactions 


Interaction effects involving operant heart rate con- 
ditioning and other related psychophysiological 
processes have been investigated in a number of 
animal studies. Curarized rats pretrained using oper-
ant methods to decrease heart rate have, for example,
been shown to subsequently acquire (in the non- 
curarized state) shuttle-box escape-avoidance behavior 
more readily than rats similarly pretrained to increase 
their heart rate (DiCara & Weiss, 1969). It has also 
been reported (Engel & Gottlieb, 1970) that blood 
pressure changes were significantly positively corre- 
lated with heart rate decreases instrumentally condi- 
tioned as an avoidance response in rhesus monkeys, 
but such blood pressure effects were uncorrelated with 
instrumentally conditioned heart rate increases in 
these same animals. Differences in the opposite direc- 
tion have been reported with respect to epinephrine 
and norepinephrine by DiCara and Stone (1970) who 
found higher endogenous cardiac and brainstem 
catecholamine levels in rats instrumentally trained to 
increase heart rate as compared to rats instrumentally 
conditioned to decrease heart rate. Cardiac H³-norep-
inephrine retention studies by these same authors,
however, suggested that rats trained to decrease heart 
rate under curare were subjected to greater stress than 
rats trained to increase heart rate. Of additional inter- 
est in this regard is the finding by DiCara, Braun, & 
Pappas (1970) that an intact neocortex is essential for 
instrumental autonomic conditioning though this ap- 
pears not to be the case with respect to the classical 
conditioning of the same heart rate and gastrointesti- 
nal responses. 

The instrumental conditioning of blood pressure 
in the curarized rat was convincingly demonstrated 
by DiCara and Miller (1968b) using a shock avoidance 
procedure to reinforce both increases and decreases in 
systolic pressure levels independently of changes in 
heart rate and rectal temperature. And in a subse- 
quent study, these same investigators (1968c) using 
electrical brain stimulation as a reinforcer with the 
curarized rat dramatically confirmed the specificity of 
such instrumental autonomic learning by selectively 
conditioning vasomotor tone increases in one ear and 
vasomotor tone decreases in the other ear of the same 
animal. Significantly, these conditioned blood flow 
changes were not correlated with heart rate, rectal 
temperature, or vasomotor tone in the tail, suggesting 
a remarkable and previously unrecognized localization 
of sympathetic action. Following a replication of these 
findings, Pappas, DiCara, and Miller (1970) further 
demonstrated that instrumentally conditioned systolic 
blood pressure increases and decreases in non-cur- 
arized rats did not transfer to the curarized state, but 
that retraining the same animals after curarization 
produced even larger magnitude pressure changes than 
in the noncurarized state. Similar observations with 
respect to the specificity of instrumentally conditioned 
cardiovascular changes previously reported by DiCara
and Miller (1968a, 1968b) received additional support 
from the finding of Pappas et al. (1970) that the in- 
strumentally conditioned blood pressure effects were 
independent of heart rate and gross skeletal activity. 


Response Magnitude and Duration 


Large magnitude diastolic blood pressure elevations 
(50-60 mm Hg) conditioned instrumentally as a shock 
avoidance response in the rhesus monkey were first 
reported by Plumlee (1969), though the relatively 
short duration of the changes and the observed 
postural effects suggested mediation by a Valsalva 
maneuver (i.e., alteration of intrathoracic pressure by
abdominal muscle contraction). Somewhat more 
modest elevations in mean arterial pressure (e.g., 25 
mm Hg) were maintained in squirrel monkeys by Ben-
son, Herd, Morse, and Kelleher (1969) for periods of
20 min or longer as a result of an operant reinforce- 
ment contingency arrangement which required the 
indicated pressure change as a shock avoidance re- 
sponse. And Harris, Findley, and Brady (1971) have 
also shown that substantial elevations in both systolic 
and diastolic blood pressure (e.g., 50-60 mm Hg)
could be established and maintained in baboons by
an operant conditioning procedure which provided 
for food delivery and shock avoidance programmed 
as environmental consequences contingent upon pre-
scribed increases in diastolic pressure levels. The in-
strumentally conditioned blood pressure changes were
sustained for intervals up to and exceeding 5 min and 
appeared to bear somewhat systematic but complex 
temporal relationships to variation in heart rate. 
More recently, these same authors (Harris, Gilliam, 
Findley, & Brady, 1973) have extended this basic in- 
strumental autonomic conditioning procedure with 
the baboon to produce more sustained and clinically 
relevant increases (30-40 mm Hg) in both systolic and 
diastolic blood pressure throughout daily 12-hour ex- 
perimental sessions, as illustrated in Figure 6. Signif- 
icantly, the maintained instrumentally conditioned 
blood pressure increases can be seen to be accom- 
panied by elevated but progressively decreasing heart 
rate levels. 


Other Response Systems 


Instrumentally conditioned glandular response 
changes have also been demonstrated in an experi-
ment with curarized rats rewarded by electrical brain
stimulation for both increases and decreases in the 
rate of urine formation by the kidney (Miller & DiCara,
1968). Using inulin-14C and tritiated p-aminohippuric
acid (PAH), it was also determined
that both glomerular filtration rate and renal blood 
flow were systematically altered by the operant conditioning
procedure, though heart rate, blood pressure,
and peripheral blood flow were not, confirming the 
high degree of specificity and localization of action 
emphasized in previous reports (DiCara & Miller, 
1968b, 1968c; Fields, 1970; Miller & Banuazizi, 1968). 
More recently, Banuazizi (1972) has extended the ex- 
perimental analysis of selectively conditioned intesti- 
nal contractions in the curarized rat by demonstrat- 
ing discriminative stimulus control of both short- and 
long-duration intestinal responses reinforced by shock 
avoidance. Significantly, the study also included a 
control for the unconditioned effects of the electric 
shock as a possible “state dependent” influence upon
the contractile response, and confirmed the instru- 
mental intestinal conditioning under even these more 
stringent requirements. 


Feedback Factors 


The role of feedback stimulation in the establish- 
ment and maintenance of instrumentally conditioned 
visceral and autonomic response changes has been 
emphasized in several recent studies (Harris, Findley, 
& Brady, 1971; Harris, Gilliam, Findley, & Brady, 
1973; Hothersall & Brener, 1969; Lang, 1970) despite 
the somewhat equivocal status of the “interoceptive
discrimination” issue as reflected in the literature of 
the past decade (Kadden, Snapper, Schoenfeld & Kop, 
1970; Kadden, Schoenfeld, & Snapper, 1970; Mandler 
& Kahn, 1960; Slucki, Adam, & Porter, 1965; Slucki, 
McCoy, & Porter, 1969). The early and more recent
studies by Miller and his students as reviewed above 
involved reinforcing stimulus changes (e.g., electric 
brain stimulation) which provided immediate feed- 
back following visceral or autonomic response varia- 
tions. As more extended duration instrumental auto- 
nomic conditioning effects have been investigated, 
however, exteroceptive stimuli (e.g., lights, tones) 
linked with the internal environment by advanced 
electrophysiological recording and amplification tech- 
niques (Brown & Thorne, 1964; Budzynski & Stoyva, 
1969; Hefferline & Keenan, 1963; Krausman, 1972) 
have provided both digital and analogue presenta- 
tions of critical interoceptive events and processes. In 
addition, such feedback stimuli serve as conditioned
reinforcers bridging the temporal gap between the
visceral response and its maintaining environmental 
consequences. Such stimulus feedback applications 
have been convincingly demonstrated in the instru- 
mental heart rate conditioning studies of Hothersall 
and Brener (1969) with curarized rats, and the oper- 
ant blood pressure conditioning experiments of Harris 
et al. (1971) with laboratory baboons. 


Interpretive and Theoretical Issues 


Despite this emergent operational orientation, 
interpretive and theoretical accounts of instrumental 
autonomic conditioning procedures and results continue
to focus upon “mediational” issues as these have
been reviewed and discussed at length in several re- 
cent reports (Crider, Schwartz, & Schnidman, 1969; 
Katkin & Murray, 1968; Katkin, Murray, & Lachman, 
1969; Kimmel, 1967; Schoenfeld, 1970, 1971). The
controversies surrounding experimental attempts to
control such “voluntary mediators” in instrumental
autonomic conditioning studies have at least empha- 
sized the need to reexamine some basic formulations 
regarding conventional distinctions between the two 
types of learning (Schoenfeld, 1966, 1972). With re- 
spect to more focused concern involving the interrela- 
tionship between autonomic-visceral and somato- 
motor activity, two more or less distinguishable points 
of view can be identified. Miller, DiCara, and their 
associates (DiCara, 1970; DiCara & Miller, 1968a, 
1968b, 1968c; Miller & Banuazizi, 1968; Pappas, DiCara,
& Miller, 1970; Trowill, 1967) on the one hand,
have appeared to take the position that evidence from
their own experiments and those of others (Schwartz,
Shapiro, & Tursky, 1971; Shapiro, Tursky, & Schwartz,
1970) supports the independence of somato-motor and 
autonomic-visceral control. Black (1967a, 1967b, 1968), 
Brener and Goesling (1968), and Obrist, Webb, Sut- 
terer, and Howard (1970), on the other hand, prefer 
to represent autonomic-visceral and somato-motor 
activities as two components of a more general, cen- 
trally controlled response process. That the dividing 
line between the two formulations may not be too 
firmly drawn, however, would seem to be indicated by 
the fact that virtually all the adherents to the latter 
school of thought (Black, 1971; Goesling & Brener, 
1972; Obrist et al., 1970) appear willing to concede 
that the postulated “normal” linkage between the two 
systems may be modified in a variety of ways. Indeed, 
a more moderate “separable but interacting” formula- 
tion (Brady, 1972) of the observed psychophysiological 
relationships may better serve the purposes of both
clinical and experimental investigators concerned
with the conditions under which dissociation or decoupling
of the two systems can and does occur in the
course of ongoing behavioral transactions between 
organism and environment. 

It is probably noteworthy that amidst this flurry 
of interpretive and theoretical discourse, recent re- 
ports by Miller and Dworkin (1974), DiCara (1974), 
and others (Brener, Eissenberg, & Middaugh, 1974) 
have raised searching questions about the replicability 
of earlier findings related to instrumentally condi- 
tioned heart rate changes. Specifically, there has been 
a progressive decline in the magnitude of learned 
changes in heart rate from the first experiment con- 
ducted in 1966 through those of the 1970's (Miller & 
Dworkin, 1974). A number of factors are being in- 
vestigated in order to provide an account of these 
“extraordinarily perplexing and vexing phenomena,” 
as Miller terms it, but no proposed explanation for 
the discrepancy has yet been confirmed. Much of the 
controversy focuses upon technological and methodo- 
logical details related to the proper use of curare and 
artificial respiration procedures. In this regard, 
Howard, Galosy, Gaebelein, and Obrist (1974) have 
enumerated several of the problems associated with 
curarization, including dosage, criteria for muscular 
blockade and artificial respiration, the side effects of 
ganglionic blockade, histamine release, and the alteration
of sensory processes.

Controversies regarding mechanism and methodol- 
ogy notwithstanding, the evidence that operant condi- 
tioning procedures, whether centrally or peripherally 
mediated, exert orderly and specific effects upon the 
functional properties of physiological systems seems 
incontrovertible. Clearly, systematic analysis of such
noninvasive, nonpharmacologic influences upon somatic
processes holds considerable promise for enriching
both clinical and experimental approaches to
the physiopathology of health disorders. It is at least 
equally important that this work clearly confirms the 
importance of the contributions made by basic be- 
havioral science to a comprehensive physiology of the 
intact, unanesthetized, conscious organism. Indeed, 
increasing emphasis upon the application of biotelemetric
techniques (McCutcheon, 1973) in psychophysiological
investigations provides tangible recognition
of this developing frontier. Certainly, the
identification and operational definition of such “internal
state” variables and their functional interactions
is of central importance in delineating a research
domain which emphasizes the critical role of environ- 
mental-behavioral influences in the production of 
physiological alterations.


Procedures for the Acquisition


A spoken word is a complex sound that can be 
produced by the vocal apparatus. As an auditory stim- 
ulus or as a piece of speech behavior, a word can be 
studied in the laboratory in ways that are similar to 
the investigation of other stimuli and behaviors. For 
example, reinforcing the production of a particular 
word in the presence of a particular stimulus object 
can increase the probability that the word will be 
produced in the presence of the same object or similar 
objects in the future. Employing a word as a dis- 
criminative stimulus in a particular context links it to 
the presence of a set of reinforcement contingencies. 
Skinner (1957) refers to these two situations as, re- 
spectively, “controlling” the word and “being con- 
trolled” by the word. Outside the behavior laboratory, 
we say that the word means something; the controlling
stimulus or controlled behavior involves or is
related to the word’s referent, the concept it stands 
for. Individual words associated with “things” in the 
world become associated with the concepts which the 
things instantiate. Concepts usually are instantiated 
by many things that fall under them (e.g., the concept
“dog”); proper names normally have only one instantiation
as a concept (e.g., the concept “Victor Borge”).
The term things is used in a general sense to include
objects, abstractions, relations, attributes, events, etc.
This is just another way of stating the law of generalization.
When more than one organism shares the
same word-concept association, a number of linguistic
functions can be performed. A speaker can, by emitting
the word, induce a change in the behavior of the
hearer which is appropriate to the concept the word
stands for. For instance, uttering the word blimp on a
crowded streetcorner is usually followed by most hearers
looking upward.

Uttering the word which stands for a particular 
concept can serve other functions besides alerting the 
hearer to the presence of an instantiation of that con- 
cept. Using the same word in the presence of different 
things may be used to indicate a class resemblance; 
i.e., the shared word facilitates generalization. Utter- 
ing the word which stands for one concept in the 
presence of a thing which instantiates a clearly differ- 
ent concept may induce the hearer to discover a con- 
nection between the concepts. Even a metaphoric 
connection might be communicated in which the 
speaker indicates that two different things are the 
same in some special sense. With information derived 
from appropriate nonlinguistic context, uttering a 
word can serve as a request for the thing which instantiates
the word’s concept or for something connected
with it, as an offer to produce the thing, or as
a proposal that the speaker and hearer become in- 
volved with or enact the thing. 

A set of words each associated with a concept is a 
lexicon. With a lexicon, speakers can name, mention,
or raise the topic of the concept(s) which each in- 
dividual word stands for. The acquisition of a lexicon, 
the production of and response to individual words, 
does not seem to be an impossible extrapolation from
laboratory studies of operants and discriminative stim- 
uli. Critics (e.g., Chomsky, 1959) of the learning the- 
orist’s approach to language do not direct their main 
criticisms against explanations of the acquisition of a 
lexicon in and of itself. Nor is the learning of lexicons 
considered to be unique to the human species (see 
Segal, chapter 22 in this volume). 

However, even with the varied use of contexts, the
uttering of and the response to an individual word 
associated with a concept is only a small part of lin- 
guistic behavior. The speaker of a natural language 
not only mentions topics but can say something about 
them. To say something about a concept or its in- 
stantiation usually requires more than a single word 
—e.g., a word for an object and a word for its at- 
tribute. Not only must the words for the object and 
attribute be present in an utterance, but there must 
also be a way to inform the hearer that the object and 
attribute named by the words in an utterance are 
related. This information about relatedness makes the 
difference between a grammatical string and a list of 
words. A grammatical string, therefore, uses two 
information-carrying systems: 


1. Lexical: the individual words (more properly,
lexemes) encode, stand for, signify the things and/or
concepts the message is about; and

2. Syntactic: a system which encodes, maps the relationships
among, the things and/or concepts as
relationships among the words in the string.


In English, conceptual relationships are encoded by 
the order of the words in a string and the use of in- 
flections (additions to and changes in the forms of 
words). 

A learning theoretic account of language must show 
how such a system of encoding relationships is acquired.
This is not a trivial problem. The difficulty
can be more easily seen by comparing the real-world, 
natural-language situation to an artificial, very simpli- 
fied world and language. Imagine a population of or- 
ganisms that are rooted in place, each one in front of 
a narrow window. Through the window, a creature
can observe a small portion of a moving conveyer belt
which passes by the windows of each of them. On the 
belt at irregular intervals appear objects of a limited 
number of types. Presupposing some survival value in 
being able to anticipate the appearance of particular 
kinds of objects, imagine that the creatures have a 
lexicon which enables them to name each of the ob- 
jects on the conveyer belt. By calling out these names, 
a creature can apprise her downstream neighbor of 
what will appear in the window and when. The only 
data which the language must encode are the type of 
each object and its temporal appearance. The tem- 
poral order and proximity of the objects are directly 
mapped by the temporal order and proximity of the 
uttered names. The relationship in time among ob- 
jects on the conveyer belt is exactly the same as the 
relationship in time among the words which name 
the objects. The “syntax” of this language maps the 
temporal order and spacing of objects on the conveyer
belt as the temporal order and spacing of the
words which stand for these objects. Such a mapping
is called “iconic” (Peirce, 1931). It hardly presents a 
problem for the learning theorist; the system requires 
no learning beyond the lexicon. 
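
The iconic mapping of the conveyer-belt language can be sketched in a few lines of Python. This is only an illustration under assumed details: the object types, the spoken names in `LEXICON`, and the use of integer ticks for time are hypothetical and are not part of the thought experiment itself.

```python
# Hypothetical lexicon: object type -> spoken name (both invented for illustration).
LEXICON = {"rock": "ga", "fruit": "mo", "thorn": "ki"}

def encode(belt_events):
    """Speaker: name each object at the moment it passes the window."""
    return [(tick, LEXICON[obj]) for tick, obj in belt_events]

def decode(utterance):
    """Hearer: look each name up; order and spacing carry over unchanged."""
    names_to_objects = {name: obj for obj, name in LEXICON.items()}
    return [(tick, names_to_objects[name]) for tick, name in utterance]

# Objects seen through the window as (time tick, object type) pairs.
belt = [(0, "fruit"), (3, "rock"), (4, "thorn")]
spoken = encode(belt)
print(spoken)
print(decode(spoken) == belt)
```

Both functions do nothing except consult the lexicon. The timing of the words is simply the timing of the objects, which is why this "syntax" requires no learning of its own.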

The difficulties arise when we leave the simple 
world of the conveyer belt creatures. In natural lan- 
guage, the word symbols of speech still appear one 
after the other in a linear temporal sequence, but the 
relationships among concepts and things communi- 
cated by natural-language speakers are much more 
varied and complicated. Somehow, the linear se- 
quence must contain sufficient information to enable 
the hearer to determine how the individual words 
work together to encode the relationships among con- 
cepts and/or things. Encoding these relationships by 
means of a linear sequence of individual symbols re- 
quires that there be a pattern to the appearance of 
the symbols. In the pattern must be the syntactic in- 
formation from which the hearer infers the way in 
which the concepts and/or things named by the in- 
dividual words are related. 

The simplest kind of syntactic information signals 
the hearer that a sequence of words should be treated 
as a linguistic unit. Even without any other syntactic 
information, this can be done by, for example, tem- 
porally grouping a sequence of words by placing a 
relatively long pause at its beginning and end. The 
information that the individual words in a sequence 
are to be taken as a linguistic unit is adequate for the 
complete understanding of many strings so long as 
the hearer has learned the meanings of the individual 
words and has adequate nonlinguistic experience. For 
example, no additional syntactic information is necessary
to understand the string “Pregnant deer nibble
tree roots.” Even if the word order is scrambled,
thereby destroying the syntactic information, knowing 
that the words function together as a linguistic unit is 
enough. There is only one state in the mundane 
world whose description can be derived by consider- 
ing these words as a linguistic unit. In a science- 
fiction tale of carnivorous trees in ecological competi- 
tion with Cervidae, two interpretations are possible. 

Even without such bizarre situations, the syntactic 
datum that the words of a string “go together” is
often, perhaps usually, inadequate by itself to under- 
stand a sentence unambiguously. More information 
from the pattern of the word sequence is necessary. 
Consider the simple sentence “The fast girl pushes 
the fat boy.” Unambiguous interpretation could not 
be made without using the syntactic information 
which is derived from the order of the individual 
words in the sentence. If the order is scrambled, this 
interpretation would be lost: “The fast boy pushes 
the fat girl”; “The girl pushes the fat boy fast”; etc.
As we will see later, the most essential syntactic information
is that which enables the hearer to determine
which words form a subunit which in turn
works together with other subunits in the string. 

Skinner (1957) suggested that the language user can 
learn to discriminate the grammatical categories of 
individual words. The series of grammatical categories
of the words in a sentence forms a pattern which
serves as a complex discriminative stimulus control- 
ling the response to the set of words in the sentence. 
Skinner called these patterns of grammatical categories 
“autoclitics.” In the presence of (i.e., appearing in the 
format of) a particular autoclitic, the individual words 
are taken to be related in a particular sort of way. For 
example, in “Alice gives Susan the wrench,” the autoclitic
“proper noun-verb-proper noun-article-noun”
signals that the first proper noun is the agent, the
verb is the activity, etc. Along with intonation
cues or punctuation, the pattern also indicates the
mode of the sentence, whether declarative, interrogative,
etc.

Although the parallel with laboratory paradigms is 
suggestive, Skinner was not clear about how the gram- 
matical patterns and their significance could be 
learned. There are complicating factors which make 
this approach unwieldy. For example, identifying the 
pattern of grammatical categories of a string of words 
cannot be done by treating the words one at a time.
The category to which a word belongs often depends 
on its relationship to other words in the sentence, 
including those which are not adjacent to the word 
in question. Consider the difficulty in identifying the
autoclitic pattern of grammatical categories upon
hearing the following sentence: “With a beat beat the 
beat beat with a beet” (Rhythmically strike the ex- 
hausted hippie with a red vegetable). Since a pro- 
cedure for assigning grammatical categories to each of 
the words could not succeed by treating each word in 
isolation, the language user must consider the rela- 
tionship each word has with other words—i.e., what it 
does with and to other words. That is, the language 
user must perform some kind of analysis on the sen- 
tence; syntactic information is not available as an 
autoclitic cue by taking the sentence as an unanalyzed 
whole. 

There is another argument, based on what linguists 
call the generative property of language, that the 
grammatical category sequence of a sentence taken as
a whole could not always be available as an auto- 
clitic: language users can process (i.e., construct and 
interpret appropriately) sentences whose grammatical 
sequences (or similar ones) they have not been ex- 
posed to before. Considering the sequence of gram- 
matical categories of the words of the entire sentence 
as an autoclitic cue, generalization from previously
experienced sentences is usually inadequate to explain 
the processing of novel sentences. The dimensions of 
similarity along which generalization occurs are im- 
possible to specify for unanalyzed whole sentences. 
But the capacity to process novel sentences can be ex- 
plained if we assume that the language user is able 
to combine and permute components of grammatical
sequences. 

The advantages of a generative, combining system
can be appreciated by considering the problem of
programming an organism to emit only a restricted subset
of a set of possible sequential behavior patterns. Imagine
a row of six levers each of which can be pushed
up or down from the neutral position. We arrange the
system so that the set of lever patterns leading to reinforcement
(call it “set A”) is defined, for example, as
follows: the first lever must be up and somewhere in
the sequence an adjacent pair of levers must be up.
There are 24 different patterns in set A out of a total
of 64 possible patterns altogether. To encourage the
organism to learn the entire set, we arrange the reinforcement
schedule so that any particular pattern is
reinforced only once every 24 reinforced trials. The
experiment may be terminated after 24 reinforced
trials.
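
The size of set A can be checked by brute force. The short Python sketch below enumerates every pattern of six levers and keeps those satisfying the two stated conditions; the function name `in_set_a` and the "U"/"D" string encoding are conveniences assumed here for illustration.

```python
from itertools import product

def in_set_a(pattern):
    """Rule from the text: first lever up, and some adjacent pair of levers up."""
    return pattern[0] == "U" and "UU" in pattern

# Every way of setting six levers up (U) or down (D).
all_patterns = ["".join(p) for p in product("UD", repeat=6)]
set_a = [p for p in all_patterns if in_set_a(p)]

print(len(all_patterns))  # 64
print(len(set_a))         # 24
```

The same count follows by hand: 32 patterns begin with an up lever, and 8 of those contain no adjacent pair of up levers, leaving 24.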

A subject treating the lever sequence as a set of 
whole patterns, even one with a perfect memory (e.g.,
a scratchpad), would need to produce all 64 patterns 
in order to be sure of having determined all the ones 
belonging to set A. A picture of the whole-pattern
learner’s information would be a list of 24 sextuples:
UUDDUD, UDDUUD, UUUDDU, etc. If you try 
this experiment as a paper-and-pencil task, you will 
find that very few subjects require experience with all 
64 patterns to be able to produce all and only those of 
the reinforced set. In fact, most subjects will be able 
to list every one of the members of set A before hav- 
ing reinforcement experience with all of them. 

What the subjects learn is something about the 
way in which components of the lever position se- 
quences go together. It is something which is not itself 
one of, or all of, the members of set A. What they 
learn might be pictured something like this: 

[Figure: diagram of the generating principle for set A]

It is an algorithm, or generating principle, for all and
only the sequences of set A. Not only can such an 
algorithm be learned without the subject having ex- 
perience with every member of set A, but the al- 
gorithm is very much easier to remember than the 
list of 24 sextuples. 

Of subjects who are able to produce nonrandomly 
the remaining members of set A without having prior 
reinforcement experience with them, we must infer 
that they are using a generating algorithm. Many sub- 
jects, in fact, are able to describe such an algorithm. 
A traditional account involving generalization from 
previously learned to novel lever sequences requires 
the dimensions of similarity between new and old 
sequences to be specified. A description of the simi- 
larity turns out to be a restatement of the algorithm. 

The algorithm can be called an “underlying struc- 
ture.” Skinner and other behaviorists have been reluc- 
tant to allow explanations involving underlying struc- 
tures which include more than input and output 
“surface” data. On the other hand, linguists and some 
psychologists have readily adopted underlying struc- 
tures as formal description of language data while 
disregarding the question of how such structures, gen- 
erating algorithms, might be acquired by the princi- 
ples of learning. In fact, many linguists, because they 
believe such structures are innate and could not pos- 
sibly be learned, assume that the learning theorist 
should not have an interest in underlying language 
structure. Lakoff (1973) critically summarizes the ar- 
guments by some of his fellow linguists (Chomsky in 
particular) that underlying linguistic structure should 
not be in the domain of the learning theorist:
There are at present no general learning
theories that can account for [linguistic struc- 
ture]. It is hard to imagine what any such 
theories could be like. Therefore, it is plausible 
to assume that there can be no such theories. 
But the argument is fallacious: Nothing follows 
from a lack of imagination. 


The set of grammatically correct sentences of a 
natural language is somewhat analogous to set A in 
the lever sequence experiment. The language user 
must be capable of generating and recognizing se- 
quences of words that are well-formed sentences in 
the language. Because the language user, analogous to 
the lever sequence subject, can generate and identify 
more sentences than have previously been experi- 
enced, some sort of underlying information structure 
—a generating algorithm—must be involved. As men- 
tioned earlier, the pattern in which the words of a 
sentence are arranged encodes information about the 
ways in which the things and/or concepts indicated 
by the words are said to be related. This encoding 
system is syntax. Since language users employ the syn- 
tax encoding system to process sentences they have not 
previously experienced, syntax must, like generative 
capacity per se, involve the underlying information 
structure. 

We can describe the lever sequence generating 
structure as being used by the subject to parse the 
reinforcement contingencies into those of set A and 
those not of set A. Natural language syntax performs
an analogous task. 

The following section demonstrates that the learn- 
ing and use of the underlying linguistic information 
structure need not be considered a mysterious process. 
A very simple learning mechanism, controlled by lin- 
guistic input and reinforcement, shows how the un- 
derlying structure could be acquired and used. The 
discussion is based on the syntax crystal model (Block, 
Moulton, & Robinson, 1974, 1975; Robinson & Moulton,
1972).

The task is threefold. First, the syntactic model must
show how relationships among concepts can be 
mapped onto relationships among words—i.e., order 
and inflections. Such a mapping must enable the lan- 
guage user to generate a sentence from which another 
language user can determine the way in which the 
concepts named by the words go together. How the 
concepts go together is what is meant by the semantics
of the string. Second, the model must show how and
under what conditions an organism can learn to per- 
form this reversible mapping operation. Finally, the 
psychological operations required should be plausible
and clear. To the extent that these demands are satisfied,
we shall have demonstrated that syntactic structure
is accessible to the learning theoretic approach.

George Robinson

Consider the set of concepts, discussed earlier, 
named by the following words: girl, boy, fast, fat, 
push(es). These concepts can combine with each other
in several ontologically coherent ways. For example, if 
our language user observes the intramural sprint 
champion of Wellesley giving her overweight brother 
a swing ride, the concepts are connected as in Figure 
1A. 

To apprise another of the essentials of the scene, 
the language user must transmit the names of the 
concepts and the set of relationships portrayed by 
connections 1-4. This particular set of relationships is
transmitted by ordering the words which name the 
concepts as follows: “Fast girl push(es) fat boy.” The 
syntactic model must map the set of brackets into the 
proper sequence of words. Further, the hearer must 
be able to determine the correct set of conceptual relationships
from the order of the words. The reversible
mapping is portrayed in Figure 1B. The connections
are numbered to show their correspondence to
those in Figure 1A. We think of the upper diagram 
as an orrery (planetarium mobile), and the job of 
syntax is revolving the units around the swivel joints 
(the dots) until they are in the proper left-right relation
to one another. In essence, the syntactic model
must constrain the correlation of word order and conceptual
relations to those of the set of well-formed
English sentences.

Fig. 1. Conceptual structure underlying a sentence portrayed
by the orrery model. A: Unordered concepts are connected
according to their scope and dependency relations. B: Using
syntactic information, the same connections are rotated to yield
the word order of the corresponding sentence.

The syntax crystal is described in terms of a two-dimensional
crystal built up of rectangular units
which join together under the control of “connection
codes” on their edges. Two rectangles can join together,
like dominoes, if their apposing edges bear the
same code. The rectangular “cards” are originally
blank and completely undifferentiated. Words and
connection codes are entered, reinforced, and extinguished
during the language-acquisition process. Upon
completion, the units of the crystal are so differentiated
that they can only be assembled in structures
which correctly correlate the word order and conceptual
relationships of well-formed English sentences.
The body of any completed crystal actually takes the
shape of the correlation map of the sentence (like that
shown in Figure 1B).
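
The domino-like joining rule can be sketched as a small data structure. The Python below is a minimal illustration, not the authors' implementation: the `Card` class, the edge names (`top`, `bottom`, `left`, `right`), and the `can_join` helper are all assumptions introduced here; the text says only that cards carry connection codes on their edges and join when apposing edges bear the same code.

```python
from dataclasses import dataclass, field

# Which edge of a neighboring card faces a given edge (assumed rectangular layout).
OPPOSITE = {"top": "bottom", "bottom": "top", "left": "right", "right": "left"}

@dataclass
class Card:
    word: str = ""                              # blank until a word is entered
    codes: dict = field(default_factory=dict)   # edge name -> connection code

def can_join(card_a, edge_a, card_b):
    """Two cards join if the apposing edges bear the same (non-empty) code."""
    code_a = card_a.codes.get(edge_a)
    code_b = card_b.codes.get(OPPOSITE[edge_a])
    return code_a is not None and code_a == code_b

left = Card(codes={"right": "A"})
right = Card(codes={"left": "A"})
other = Card(codes={"left": "S"})

print(can_join(left, "right", right))
print(can_join(left, "right", other))
```

A card with no code on the relevant edge never joins, which corresponds to the originally blank, undifferentiated cards described above.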

We assume that syntax acquisition requires a lin- 
guistically competent speaker (Parent) who talks with 
the language learner (Child) about aspects of the 
world which the Child can comprehend. We further 
assume that the Child has learned the meanings of 
some individual words. To begin the process, the
Parent might stride about the room and say to the
Child: “Parent walks.” All the Child need do is recognize
that there is some relationship in the world between
what it knows to be the referents of “parent”
and “walks.” It doesn’t need to distinguish this relationship
as, for instance, “actor-activity.” It only needs
to detect that a relationship, any relationship, exists.
In this case, a description of the data used to detect
the presence of a relationship might be that at the
occurrence of the utterance “Parent walks,” the things
named by the two words are simultaneously very
salient. The Gestalt “law of common fate” could be
invoked here. Whenever the Child detects that two
things spoken of are related, it forms a connection
between the words of the accompanying utterance
which name those things. The conceptual connection,
derived from the Child’s observation of the environment,
is combined with the order of the words derived
from the utterance (see Figure 2A). The syntax crystal
models this as shown in Figure 2B. The words are entered
on the bottom of adjacent blank cards and the
connection pathway is formed by entering the (arbitrary)
codes R, R, A, A, S, S on apposing edges of the
word cards and two additional blank cards as shown.
If separated, these four cards can combine with one
another in only one way. Each pair of identical letters
on apposing edges of two cards symbolizes a
learned connection. Since the word cards are not connected
directly to one another, the connection between
word cards is a mediated one. The mediation is
abstract and internal: the information on the two
upper cards has no relationship to the linguistic or
nonlinguistic input other than representing the language
learner’s detection of a conceptual connection
between “parent” and “walks.”

Fig. 2. The first of a set of syntax crystal cards generated
according to the learning algorithm. Conceptual relations and
word order determine the construction of underlying syntactic
structure. See text for details.

Now, perhaps the Parent, observing the Child 
walking, says: “Baby walks.” The Child, noticing that 
the concepts named by the two words are related, 
forms a connection between “baby” and “walks.” To 
rely as much as possible on past learning, the Child 
uses as many of the previously established cards as 
possible—namely, the three attached cards shown in 
Figure 2C—and adds the connection code “R” onto 
the “baby” card, as shown in Figure 2D. 

For more complicated strings, “Parent walks fast” 
and “Baby walks fast,” the Child must realize that 
there is some connection between the concepts ex- 
pressed by “walks” and “fast.” It doesn’t have to know 
that such a connection is “predicate plus qualifier.” It 
just needs to recognize that there is some state of the 
world that obtains when “walks” and “fast” are used 
together which does not obtain when they are used 
separately. To connect the “walks” card to the new 
“fast” card, the Child constructs the set of cards 
shown in Figure 2E and joins it to the previously 
created structure as shown in Figure 2F. “Parent” is 
connected to the entire phrase “walk(s) fast” and not 
just to the verb “walk(s).” 

Placing parentheses around a connection code in- 
dicates that the connection is optional. An optional 
code is one which may be left unconnected in a com- 
pleted crystal. This permits “Walk fast” to be con- 
structed using existing cards as a result of the Parent 
saying “Walk fast” to the Child. 

Two or more connected words which are in turn 
connected to another part of the crystal structure by 
a single card (e.g., the card in Figure 2G) form a “con- 
stituent.” (A single word may also be a constituent.) 
If a word or phrase can be substituted for a particular 
constituent and the result is a well-formed utterance, 
the new unit is connected by the same code as the 
constituent it replaces. For example, the set of cards 
shown in Figure 2H could be substituted for that in 
Figure 2I; so it, too, is given the (S) top code. 

Feedback as to whether the substitution of a con- 
stituent results in a well-formed utterance comes from 
the response of the Parent. In most cases the response 
informs the Child whether its trial utterance is ac- 
ceptable. Usually such feedback confounds syntactic, 
semantic, and possibly stylistic acceptability; parents 
tend to withhold positive responses when their chil- 
dren’s utterances are ungrammatical, nonsensical, im- 
polite, or any combination of these faults. In addition, 
criteria of acceptability vary widely across parents and 
the age of the child. The important point is that the 
linguistic knowledge symbolized by connections in the 
syntax crystal is shaped by parental responses: a coded 
connection is maintained when its use results in an 
acceptable utterance; a connection is eliminated when 
its use results in an unacceptable utterance. 

This “on-or-off” connection code is not a necessary 
feature and is used here for simplicity. Alternatively, 
we can suppose that the “strength” of each code varies 
as a function of how frequently its use results in an 
acceptable utterance. Then if the use of a new code 
results in unacceptable utterances, extinction will 
lower the strength of the new code below threshold 
while the stronger, more frequently used codes re- 
main active. 
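
The graded alternative described above can be sketched in a few lines. This is only an illustration under stated assumptions: the linear update rule, the learning rate, the starting strength, and the threshold value are all hypothetical choices, since the text specifies only that strength rises with acceptable use and that extinction can push a new code below threshold.

```python
from dataclasses import dataclass


@dataclass
class ConnectionCode:
    """A connection code with a graded strength instead of an on/off state."""
    name: str
    strength: float = 0.5  # hypothetical starting value

    def update(self, acceptable: bool, rate: float = 0.2) -> None:
        # Move strength toward 1 after an acceptable utterance,
        # and toward 0 (extinction) after an unacceptable one.
        target = 1.0 if acceptable else 0.0
        self.strength += rate * (target - self.strength)

    def is_active(self, threshold: float = 0.3) -> bool:
        # A code is usable only while its strength is at or above threshold.
        return self.strength >= threshold


old_code = ConnectionCode("established")
new_code = ConnectionCode("candidate")
for _ in range(5):
    old_code.update(acceptable=True)   # frequently reinforced code
    new_code.update(acceptable=False)  # code whose use is rejected

print(old_code.is_active(), new_code.is_active())
```

With these assumed parameters, five rejections are enough to drop the candidate code below threshold while the reinforced code stays active, which is the qualitative behavior the paragraph describes.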

We can now outline in brief a “strict learning pro- 
cedure” (as distinct from the “strict training proce- 
dure” of Premack, 1970) which constructs a set of 
syntax crystal cards representing underlying linguistic 
structure. 


I. Basic Connection. 


A. For a two-word string or a longer string in 
which the Child only knows the meanings 
of two words: 


1. Place two blank cards with short edges 
adjacent. 


2. Enter the first word on the left card, the 
second word on the right card. 


3. Place two blank cards above the word 
cards and assign them arbitrary but dis- 
tinct matching codes that connect them 
to one another and to the word cards. 


B. For sentences where three or more words 
are known to name things which are con- 
ceptually connected: Connect two adjacent 
words and then connect the block of four 
cards so formed to the third word and then 
that block to the fourth word and so on. 
Which two words are first connected to- 
gether as a subunit is determined by the 
way in which the concepts named by the 
words are related. The most important 
feature of the relationship is the scope of 
each word—i.e., which other words name 
concepts that are operated on by the con- 
cept named by the word in question. An 
example shows what is involved: “Big 
parent walks.” The scope of “big” is “par- 
ent,” while the scope of “walks” is “big 
parent.” Therefore, the syntax crystal con- 
nects “big” to “parent” and then connects 
that pair as a unit to “walks.” There are 
many details about scope which are omitted 
from this discussion. Well-formed sentences 
can still be processed by a syntax crystal 
without these details, but some power to 
unambiguously encode complex conceptual 
relations may be lost. The interested reader 
should consult the references. 


II. Substitution of Constituents. 


A. If replacing a previously connected constit- 
uent by a new word or string results in an 
acceptable sentence, the new “candidate 
constituent” receives at its attachment edge 
the same code as the one for which it is 
substitutable. Existing connection codes 
should be tried first. If assigning existing 
codes results in the production of unaccept- 
able sentences, the candidate constituent 
is connected with a new code. As the need 
for finer syntactic distinctions develops, 
some previously acceptable substitutions 
may result in the production of unaccept- 
able utterances. In these cases, the old con- 
nection codes which are shared by other 
constituents are replaced by new, unique 
codes, as in I above. 

B. Whenever a unique side code (on a short 
edge of a card) is assigned, the bottom code 
of that card must also be changed in order 
to make it unique. 


III. Optionally Connectable Codes. 
Whenever an acceptable string can result from 
leaving a coded edge unconnected, the code is 
made optional by placing parentheses around 
it. 
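
As a rough illustration of rules I.A and III, the sketch below builds the four cards for a two-word string and marks a code as optional. The `Card` class, the `new_code` helper, and the generated code names (`c1`, `c2`, ...) are hypothetical conveniences, not part of the model as published; the sketch also ignores the spatial layout of the cards and records only which edges carry which codes.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Optional


@dataclass
class Card:
    """One syntax crystal card; None marks a blank edge."""
    bottom: Optional[str] = None
    left: Optional[str] = None
    top: Optional[str] = None
    right: Optional[str] = None


_fresh = count(1)  # hypothetical source of arbitrary, distinct code names


def new_code() -> str:
    return f"c{next(_fresh)}"


def basic_connection(first: str, second: str) -> List[Card]:
    """Rule I.A: join two known words through two structure cards."""
    a, b, c = new_code(), new_code(), new_code()
    return [
        Card(bottom=first, top=a),   # left word card
        Card(bottom=a, right=b),     # structure card above the first word
        Card(bottom=c, left=b),      # structure card above the second word
        Card(bottom=second, top=c),  # right word card
    ]


def make_optional(code: str) -> str:
    """Rule III: parentheses mark a code that may be left unconnected."""
    return code if code.startswith("(") else f"({code})"


cards = basic_connection("big", "parent")
for card in cards:
    print(card)
```

Each pair of matching codes here stands for one learned connection, so the two word cards are linked only through the two structure cards, as in the mediated connection described earlier.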


Here follows an illustration of these procedures to 
generate a syntax crystal. For heuristic purposes, the 
connection code letters suggest the conventional gram- 
matical categories: N for noun, etc. Finer distinctions 
within a category are made by subscripting the letters: 
N1, N2, etc. 

Starting with the string “Big parent,” the learning 
procedure generates four cards (reading the card edges 
clockwise): 


Card  Bottom     Left  Top       Right 
1.    big        -     J1        - 
2.    J1         -     -         J1 
3.    N1         J1    -         - 
4.    parent     -     N1        - 


The codes on cards 1 and 4 must be different from 
each other by rule IIA because “big big” and “parent 
parent” are unacceptable strings. 

A second input string, “Parent walks,” results in: 


5.    N2         -     -         S1 
6.    V1         S1    -         - 
7.    walk(s)    -     V1        - 


and card 4 will have the code N2 added to the top as 
required by rule IIB: 


4.    parent     -     N1,2      - 


A novel string can be produced by adding (N2) to the 
top of card 3: 


3.    N1         J1    (N2)      - 


As described earlier, the connections are made with 
“big parent” and “walks” as constituents which would 
eventually predominate over “big” and “parent walks.” 
The string “Walk slowly” results in: 


8.    slowly     -     B1        - 
9.    B1         B1    -         - 
10.   V2         -     -         B1 


and the addition of V2 to the top of card 7: 


7.    walk(s)    -     V1,2      - 


“Parent walks slowly” can be produced by adding 
(V1) to the top of card 10: 


10.   V2         -     (V1)      B1 


“Parent pushes” results in: 


11.   pushes     -     V1,2      - 


after producing and receiving positive feedback for 
“Parent pushes slowly.” “Pushes” is at this time syn- 
tactically equivalent to “walks.” 

This equivalence breaks down with the string “Par- 
ent pushes baby.” At first, “baby” appears to be sub- 
stitutable for “slowly” in the previous string. But sub- 
stitution for “slowly” in “Parent walks slowly” fails 
(assuming the less usual transitive meaning of “walk” 
has not yet been learned). Therefore, “baby” receives 
a unique top code: 


12.   baby       -     N3        - 
13.   N3         N3    -         - 


Also required is: 


14.   V3         -     V1        N3 


and the modification of card 11: 


11.   pushes     -     V1,2,3    - 


The string “Push parent” allows the V1 code of 
card 14 to be made optional: 


14.   V3         -     (V1)      N3 


and requires that N3 be added to the top of card 4: 


4.    parent     -     N1,2,3    - 


The string “Walk toward baby” results in: 


15.   toward     -     P1        - 
16.   P1         -     B1        N3 


The string “Parent can push” results in: 


17.   can        -     M1        - 
18.   M1         -     V1        M1 
19.   V4         M1    -         - 


and card 11 will have V4 added to its top: 


11.   push(es)   -     V1-4      - 


Only 10 input strings were provided. With the sub- 
stitution tests and modification on the basis of paren- 
tal reinforcement, the following 19 cards of a syntax 
crystal result: 


Card  Bottom     Left  Top       Right 
1.    big        -     J1        - 
2.    J1         -     -         J1 
3.    N1         J1    (N2)      - 
4.    parent     -     N1-3      - 
5.    N2         -     -         S1 
6.    V1         S1    -         - 
7.    walk(s)    -     V1,2,4    - 
8.    slowly     -     B1        - 
9.    B1         B1    -         - 
10.   V2         -     (V1,2,4)  B1 
11.   push(es)   -     V1-4      - 
12.   baby       -     N1-3      - 
13.   N3         N3    -         - 
14.   V3         -     (V1,2,4)  N3 
15.   toward     -     P1        - 
16.   P1         -     B1        N3 
17.   can        -     M1        - 
18.   M1         -     V1        M1 
19.   V4         M1    -         - 


With these 8 word cards and 11 “structure” cards, 
over 1,000 well-formed strings of nine words or less 
can be generated. Examples are: “Big parent can 
push slowly,” “Baby walks slowly toward parent.” 
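
For readers who want to experiment with the card set, it can be written down as plain data. The sketch below is my transcription of the 19-card table, so any misreading of a subscript is an error of the sketch rather than of the model; the tuple layout and variable names are hypothetical. It checks the card counts and a simple consistency property (every code on a left edge has a partner on some right edge). It does not attempt to assemble crystals or to reproduce the count of well-formed strings.

```python
from typing import Dict, Optional, Tuple

Edges = Tuple[Optional[str], Optional[str], Optional[str], Optional[str]]

# Edge order follows the table: (bottom, left, top, right); None is a blank edge.
CARDS: Dict[int, Edges] = {
    1: ("big", None, "J1", None),
    2: ("J1", None, None, "J1"),
    3: ("N1", "J1", "(N2)", None),
    4: ("parent", None, "N1-3", None),
    5: ("N2", None, None, "S1"),
    6: ("V1", "S1", None, None),
    7: ("walk(s)", None, "V1,2,4", None),
    8: ("slowly", None, "B1", None),
    9: ("B1", "B1", None, None),
    10: ("V2", None, "(V1,2,4)", "B1"),
    11: ("push(es)", None, "V1-4", None),
    12: ("baby", None, "N1-3", None),
    13: ("N3", "N3", None, None),
    14: ("V3", None, "(V1,2,4)", "N3"),
    15: ("toward", None, "P1", None),
    16: ("P1", None, "B1", "N3"),
    17: ("can", None, "M1", None),
    18: ("M1", None, "V1", "M1"),
    19: ("V4", "M1", None, None),
}

WORDS = {"big", "parent", "walk(s)", "slowly", "push(es)", "baby", "toward", "can"}

# Word cards carry a word on the bottom edge; all other cards are structure cards.
word_cards = [n for n, edges in CARDS.items() if edges[0] in WORDS]
structure_cards = [n for n in CARDS if n not in word_cards]

# Side codes must pair up: a left-edge code needs a matching right-edge code.
left_codes = {edges[1] for edges in CARDS.values() if edges[1]}
right_codes = {edges[3] for edges in CARDS.values() if edges[3]}

print(len(word_cards), len(structure_cards))
print(left_codes == right_codes)
```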

Connecting word cards hierarchically through 
structure cards is not an arbitrary procedure. Struc- 
ture cards can be considered to arise from a kind of 
natural selection process. Think of the word cards to 
be connected as surrounded by a matrix of blank 
cards. Initially the language learner generates a (per- 
haps random) variety of paired codes on surrounding 
cards, connecting the two word cards through a num- 
ber of different pathways, direct word-word connec- 
tions, and mediated hierarchical connections. Through 
substitutivity of words and larger constituents, the 
hierarchical pathways are more general and used more 
often compared with the direct word-word pathways 
which are specific to particular word sequences. Let 
the connections be strengthened as a function of use, 
perhaps supplemented by a “housecleaning” opera- 
tion which weakens or eliminates the infrequent con- 
nections. This does not deny that some word-word 
connections remain—they are such stuff as clichés are 
made on. 

The syntax crystal is thus a system for acquiring 
and using underlying linguistic information. It repre- 
sents coordinately the relations among concepts and 
the sequence of words which encodes these relations. 
Linguists often test the adequacy of such representa- 
tions by applying them to ambiguous sentences to 
determine whether the representation is able to por- 
tray the difference between the multiple meanings. A 
famous stumbling block is a sentence like “Visiting 
professors can be dull.” It may be taken to suggest the 
possibility either that paying social calls on academics 
is dull or that academics from other universities are 
dull. These two interpretations can be “read off” the 
two different crystal structures which generate the 
sentence. They are shown in Figure 3 in skeleton 
form. The second half of the sentence, “can be dull,” 
is the same for both versions, but in the crystal in 
Figure 3A it is connected directly to “visiting,” with 
“professors” connected less directly as a qualifier. In 
the crystal in Figure 3B “can be dull” is connected 
directly to “professors,” with “visiting” connected less 
directly as a qualifier. 

Given any well-formed string, the syntax crystal 
can be used to determine the conceptual structure(s) 
underlying the sentence. Building the crystal up from 
the words until no more coded edges remain uncon- 
nected produces a structure (like those illustrated in 
skeleton in Figure 3) which shows how the various 
constituent units go together. This information, the 
parse, enables the hearer to infer the conceptual rela- 
tionships encoded by the syntax of the sentence. The 
reader interested in further details of the model 
should consult the references. 


Fig. 3. Simplified skeleton crystal structures for two interpreta- 
tions of an ambiguous sentence. A: The second half is con- 
nected directly to “visiting,” and less directly to “professors” 
to provide one meaning. B: The reverse arrangement provides 
a different meaning. 

To sum up, this discussion tries to describe the un- 
derlying structure of natural-language syntax in a 
manner congenial to the operant approach. Structure 
is crucially involved in understanding and generating 
sentences. Many linguists and psychologists view lan- 
guage structure as beyond the reach of learning theory 
and consider language acquisition to be a species- 
specific, innately controlled process. This chapter uses 
the recently developed syntax crystal model to show 
that the hierarchical structure which appears to un- 
derlie language can be acquired through learning. An 
iterative mediated association principle, controlled by 
reinforcement, can produce such structure without 
assuming innate linguistic organization. 

The model in its present stage of development con- 
stitutes a logical argument that structure can develop 
by learning principles; we have yet to demonstrate 
that children do operate this way. There is much to 
be done. I hope that researchers in the field of learn- 
ing will be encouraged to consider the structure of 
language as falling within their domain. 


Toward a Coherent Psychology of Language* 


TOWARD A COHERENT PSYCHOLOGY 
OF LANGUAGE 


In the vintage year 1957 two important theoretical 
treatises on language appeared, Chomsky’s Syntactic 
Structures and Skinner’s Verbal Behavior. In the 
years since, two rival camps have grown up in psychol- 
ogy, one following Chomsky and the other Skinner. 
The rivalry seems vain, however, for grammatical 
theory, especially in its more recent versions, rather 
complements the behavioral view of language than 
clashes with it. The plan of this chapter is to sketch 
one version of generative grammatical theory, the 
version of Chomsky’s (1965) Aspects of the Theory of 
Syntax, which is often called the “standard” generative- 
transformational theory of syntax. Then I will sketch 
Skinner’s (1957) theory of verbal behavior. Then I 
will try to show, in a general way, how the theories 
complement one another. 

Readers will learn little about the interesting re- 
search on language and verbal behavior that psycho- 
linguists and some behaviorists have been doing, but 
perhaps they will find that a wish to learn more has 
been awakened and that they are open to what each 
group has to offer. Throughout, I shall speak inter- 
changeably of psycholinguists, generative grammari- 
ans, and cognitive theorists, although the terms are 
not exactly coextensive. As I see it, the two broad 
divisions that cut across several disciplines are cogni- 
tive theory and functional theory. Among cognitive 
theorists I count generative grammarians, psycholin- 
guists, mentalists, nativists, Gestaltists, information 
theorists, memory theorists, and so on. Among func- 
tional theorists I count the many varieties of behav- 
iorists and learning theorists as well as students of 
verbal learning in the traditional functional camp 
(Hilgard & Bower, 1975). 


*I thank Derek Hendry for his sympathetic criticism of an 
earlier version of the manuscript, and Suzette Elgin, whose ex- 
pert criticism rid the manuscript of a few (surely not all) of its 
more glaring linguistic solecisms. 


COMPETENCE:PERFORMANCE::STRUCTURE: 
FUNCTION 


At places in his writings Chomsky claims to be aim- 
ing only for an economical but comprehensive formal 
(structural) description of the sentences a language 
may contain. This, when reached, would be a theory 
of “competence.” Ideally, it would be general enough 
to apply to all natural (human) languages, and not 
just to one language. At other places Chomsky sug- 
gests a special relevance of grammatical theory for 
psychology—that is, that grammatical theory shows 
the way to a theory of “performance.” The least rel- 
evance a competence theory would have is that it 
would prescribe part of the scope that an adequate 
functional theory must have, the grammatical facts 
that psychological mechanisms must explain (Catania, 
1972, 1973). Beyond this, the formal theory of gen- 
erative grammar could offer more; its proposals about 
deep and surface structures and the formal rules that 
(deductively) generate each of these may (not “must”) 
be adopted as hypotheses about psychological reality, 
about the psychological machinery that (functionally) 
generates utterances. Psycholinguists have treated 
grammatical theory in this second way; they have 
adopted one or another current version of generative 
theory as a theory of verbal performance (productive 
and receptive). 

The value of the theory as a linguistic formalism 
(as a theory of competence) of course does not depend 
on its utility as a theory of verbal behavior (a theory 
of performance). Whether generative grammar turns 
out to be a useful psychological theory will be an- 
swered in the course of time. I will not evaluate that 
question, nor will I survey the various competing 
variants of generative grammar that are the daily con- 
cern of linguists. The question I address is this: as a 
theory of performance, what relation does the theory 
of generative grammar (as exemplified in Chomsky’s 
1965 theory) bear to a behavioral, functional theory 
(as exemplified in Skinner’s 1957 theory)? The sur- 
prising conclusion I have come to is that the theories 
are in part complementary, in the sense that they 
deal with different but not conflicting problems, and 
in part isomorphic, in the sense that when they do 
address the same problems, they propose (roughly) the 
same answers. This broad compatibility (neglecting 
differences in detail) has gone unnoticed because of 
the different vocabularies in which the theories are 
couched. As it turns out, mentalists and behaviorists 
are talking about much the same things. 


CHOMSKY’S STANDARD THEORY OF 
TRANSFORMATIONAL-GENERATIVE 
GRAMMAR 


A generative grammar must be a system of 
rules that can iterate to generate an indefinitely 
large number of structures. This system of rules 
can be analyzed into the three major compo- 
nents of a generative grammar: the syntactic, 
phonological, and semantic components. 

The syntactic component specifies an infinite 
set of abstract formal objects, each of which 
incorporates all the information relevant to a 
single interpretation of a particular sentence. 
Since I shall be concerned here only with the 
syntactic component, I shall use the term “sen- 
tence” to refer to strings of formatives rather 
than to strings of phones. . . . A string of for- 
matives specifies a string of phones uniquely 
. . . but not conversely. 

The phonological component . . . determines 
the phonetic form of a sentence generated by the 
syntactic rules. . . . The semantic component 
determines the semantic interpretation of a sen- 
tence. . . . The syntactic component of a gram- 
mar must specify, for each sentence, a deep 
structure that determines its semantic interpre- 
tation and a surface structure that determines its 
phonetic interpretation. . . . 

The central idea of transformational gram- 
mar is that [deep and surface structures] are, in 
general, distinct and that the surface structure 
is determined by repeated application of certain 
formal operations called “grammatical transfor- 
mations” to objects of a more elementary sort. 
. . . The syntactic component must generate 
deep and surface structures, for each sentence, 
and must interrelate them. . . . 

The base of the syntactic component is a sys- 
tem of rules that generate a highly restricted 
(perhaps finite) set of basic strings, each with an 
associated structural description called a base 
Phrase-marker. These base Phrase-markers are 
the elementary units of which deep structures 
are constituted. . . . Underlying each sentence 
of the language there is a sequence of base 
Phrase-markers, each generated by the base of 
the syntactic component. I shall refer to this 
sequence as the basis of the sentence that it un- 
derlies. 

In addition to its base, the syntactic compo- 
nent of a generative grammar contains a trans- 
formational subcomponent. This is concerned 
with generating a sentence, with its surface struc- 
ture, from its basis. .. . 

Since the base generates only a restricted set 
of base Phrase-markers, most sentences will have 
a sequence of such objects as an underlying 
basis. Among the sentences with a single base 
Phrase-marker as basis, we can delimit a proper 
subset called “kernel sentences.” These are sen- 
tences of a particularly simple sort that involve 
a minimum of transformational apparatus in 
their generation. . . . One must be careful not 
to confuse kernel sentences with the basic strings 
that underlie them. 

A grammar that generates simple Phrase- 
markers . . . may be based on a vocabulary of 
symbols that includes both formatives (the, boy, 
etc.) and category symbols (S, NP, V, etc.). The 
formatives, furthermore, can be subdivided into 
lexical items (sincerity, boy) and grammatical 
items (Perfect, Possessive, etc.; . . .). (Chomsky, 
1965, pp. 15-18, 65—italics his) 

In summary, the “base” in Chomsky’s theory is a 
set of “phrase structure” or “rewriting” rules that 
generate “base phrase markers.” (I will give an exam- 
ple in a moment.) The set of all base phrase markers 
underlying a sentence are together called the “deep 
structure” of that sentence. The base phrase markers 
of a language are “highly restricted (perhaps finite).” 
However, transformational rules applied to single 
base phrase markers or to sets of base phrase markers 
allow for the generation of an infinite variety of ob- 
served surface structures in a language. This is an 
important point, for Chomsky has often argued (e.g., 
Chomsky, 1959) that a theory of verbal performance, 
just as a theory of verbal competence, must account 
for the infinite number of sentences that can occur in 
a language, and that associationistic stimulus-response 
theories, in principle, cannot do so, for they lack the 
requisite transformational rules that permit a speaker 
to generate a variety of surface sentences from a sin- 
gle basis. (But see Robinson, chapter 21 in this vol- 
ume and later sections of this chapter.) 


Deep Structures 


The phrase structure rewriting rules that generate 
the restricted set of base phrase markers incorporate 
basic grammatical relations like subject-predicate, 
verb-object, and so on and describe the hierarchical 
arrangement of these grammatical categories. Here is 
an example, from Chomsky (1965, pp. 106-107), of 
some phrase structure rewriting rules: 


i.    S → NP Predicate phrase 
ii.   Predicate phrase → Aux VP (Place) (Time) 
iii.  VP → { Copula Predicate 
             V { (NP) (Prep-Phrase) (Prep-Phrase) (Manner) 
                 S’ 
                 Predicate } } 
iv.   Predicate → { Adjective 
                    (like) Predicate-Nominal } 
vii.  NP → (Det) N (S’) 
xvi.  Aux → Tense (M) (Aspect) 
xvii. Det → (pre-Article of) Article (post-Article) 


In each case, the term to the left of the arrow can 
be rewritten as the terms to the right of the arrow. 
Brackets indicate that the term on the left can be 
rewritten as any one of the sets of terms within the 
brackets on the right, and parentheses indicate that 
the parenthetical term is optional. When these rewrit- 
ing rules are interpreted graphically, the result is a 
“tree diagram” that shows diagrammatically the hier- 
archical structure of a base phrase marker. For exam- 
ple, the rules given above would generate, among oth- 
ers, this tree diagram: 


S 
├── NP 
│   ├── DET 
│   │   ├── PRE-ART 
│   │   ├── OF 
│   │   ├── ART 
│   │   └── POST-ART 
│   ├── N 
│   └── S’ 
│       ├── Δ 
│       └── Δ 
└── PREDICATE PHRASE 
    ├── AUX 
    │   └── TENSE 
    │       └── PAST 
    └── VP 
        ├── V 
        └── MANNER 


The diagram shows that a sentence (S) is constituted 
of two main parts (“immediate constituents”): noun 
phrase (NP) and predicate phrase. These two main 
constituents are themselves constituted of immediate 
constituents: for the noun phrase, Det (determiner), 
N (noun), S’ (embedded sentence), and for the predi- 
cate phrase, Aux (auxiliary) and VP (verb phrase). At 
the next level (“up” or “down” makes no difference) 
in the phrase structure hierarchy, the determiner is 
shown to consist of the immediate constituents pre- 
article of, article, and postarticle; the embedded sen- 
tence (S’) is subdivided into two “dummy” parts to 
show that it, like all sentences, consists at the least of 
a noun phrase and a predicate phrase; the auxiliary 
in this case is rewritten as its single immediate con- 
stituent, tense; and the verb phrase is subdivided into 
its immediate constituents (V: verb; Manner: adverb 
of manner). Except for the “lexical formative” of and 
the “grammatical formatives” Past and S’, the tree 
diagram shows no “terminal symbols” but only “pre- 
terminal category symbols,” and it is therefore un- 
finished. Further rewriting rules would rewrite all the 
preterminal symbols into terminal formatives, yield- 
ing such basic strings as: 


1a. Many of the young chimpanzees (S’) (Past) learn 
quickly. 

1b. One of the clever chimpanzees (S’) (Past) sign 
fluently. 

1c. Several of the verbal chimpanzees (S’) (Past) lie 
shamelessly. 
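
The way rewriting rules unfold into a basic string can be sketched programmatically. The sketch below is a deliberate simplification and not Chomsky's formalism: it hard-codes one expansion per category (the choices that yield the tree diagram shown earlier), it does not model optional or bracketed alternatives, and the `RULES` and `LEXICON` tables and the `expand` function are hypothetical names introduced only for illustration.

```python
from typing import Dict, List

# One fixed expansion per category, chosen to match the tree diagram.
RULES: Dict[str, List[str]] = {
    "S": ["NP", "Predicate-Phrase"],
    "Predicate-Phrase": ["Aux", "VP"],
    "VP": ["V", "Manner"],
    "NP": ["Det", "N", "S'"],
    "Aux": ["Tense"],
    "Det": ["pre-Article", "of", "Article", "post-Article"],
}


def expand(symbol: str) -> List[str]:
    """Rewrite a symbol until only symbols without a rule remain."""
    if symbol not in RULES:
        return [symbol]
    result: List[str] = []
    for child in RULES[symbol]:
        result.extend(expand(child))
    return result


preterminals = expand("S")

# Hypothetical lexical insertion for basic string 1a.
LEXICON = {
    "pre-Article": "Many",
    "Article": "the",
    "post-Article": "young",
    "N": "chimpanzees",
    "S'": "(S')",
    "Tense": "(Past)",
    "V": "learn",
    "Manner": "quickly",
}
basic_string = " ".join(LEXICON.get(symbol, symbol) for symbol in preterminals)

print(preterminals)
print(basic_string)
```

The recursion in `expand` records nothing about how the string was derived; a fuller sketch would return the tree itself, since it is the phrase marker and not the string that later transformations operate on.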


Surface Structures 


One must distinguish between basic strings (such 
as the strings of formatives 1a, 1b, 1c) and their as- 
sociated structural descriptions or base phrase markers. 
The unfinished tree diagram given above shows part 
of the structural description of each of the basic 
strings in diagrammatic form. (An alternative method 
of displaying these phrase markers is “labeled bracket- 
ing” of the string.) The distinction is important be- 
cause transformations apply to the base phrase mark- 
ers and not directly to the basic strings. The basic 
strings, as such, do not show the “derivational history” 
of the strings, but the hierarchical base phrase mark- 
ers do show exactly the derivational history by which 
the strings were generated by phrase structure rules 
from the initial symbol S. This derivational history 
(or hierarchical structure of the string) determines the 
transformations that will apply to the deep structure 
to generate a surface structure. 

The basic strings of set 1 still contain the symbol 
S’, standing for an embedded sentence. That means 
that the analyses of these deep structures are still in- 
complete. Another base phrase marker would need 
to be generated by phrase structure rules to yield the 
structural description of each of these embedded sen- 
tences. Suppose the additional base phrase markers 
were associated with these basic strings of terminal 
formatives: 


2a. (That) Rumbaugh (Past) train chimpanzees. 
2b. (That) The Gardners (Past) raise chimpanzees. 
2c. (That) Chimpanzees (Past) talk to Premack. 


Then the two base phrase markers associated with 
strings 1a and 2a would together make up the deep 
structure of surface string 3a (see below); base phrase 
markers associated with strings 1b and 2b would 
make up the deep structure of surface string 3b; and 
so on. 

Given the deep structures, transformational rules 
would now be applied to generate surface structures. 
Some transformational rules apply to single base phrase 
markers. [For example, simple transformational rules 
would transform “(Past) learn,” “(Past) sign,” “(Past) 
lie,” and so on into “learned,” “signed,” “lied,” and so 
on.] If the phrase markers associated with set 1 did 
not contain embedded Ss, and if the phrase markers 
associated with set 2 did not contain symbols [such as 
“(That)”] marking their status as embedded, then 
these transformations on individual base phrase mark- 
ers could complete the generation of six simple kernel 
sentences: 


3a. Many of the young chimpanzees learned quickly. 
3b. One of the clever chimpanzees signed fluently. 
3c. Several of the verbal chimpanzees lied shamelessly. 
3d. Rumbaugh trained chimpanzees. 
3e. The Gardners raised chimpanzees. 
3f. Chimpanzees talked to Premack. 
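
The tense rule mentioned above, which turns “(Past) learn” into “learned,” can be mimicked with a toy substitution. This is an assumption-laden stand-in: in the theory the transformation applies to the base phrase marker, whereas the sketch operates directly on the string, and the `past_form` helper handles only the regular verbs that occur in these examples.

```python
import re


def past_form(verb: str) -> str:
    """Regular past tense only: add -d after a final e, otherwise -ed."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


def apply_past(basic_string: str) -> str:
    """Replace each '(Past) verb' sequence with the inflected verb."""
    return re.sub(
        r"\(Past\) (\w+)",
        lambda match: past_form(match.group(1)),
        basic_string,
    )


kernel_3d = apply_past("Rumbaugh (Past) train chimpanzees.")
kernel_3e = apply_past("The Gardners (Past) raise chimpanzees.")
kernel_3f = apply_past("Chimpanzees (Past) talk to Premack.")

print(kernel_3d)
print(kernel_3e)
print(kernel_3f)
```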


(I am simplifying these examples. In standard genera- 
tive theory, the adjectives in prenominal position in 
the basic strings of set 1 might actually indicate other 
embedded phrase markers, associated with the basic 
strings: 


4a. The chimpanzees were young. 
4b. The chimpanzees were clever. 
4c. The chimpanzees were verbal. 


Then a transformation would permute the adjectives 
from predicate-adjective to prenominal position.) 

However, the base phrase markers of set 1 do con- 
tain embedded S’ symbols, so further transformational 
rules must be applied to the sets of base phrase mark- 
ers associated with (1a, 2a), (1b, 2b), and (1c, 2c) to 
yield the following complex surface sentences (with 
associated surface structures): 


5a. Many of the young chimpanzees that Rumbaugh 
trained learned quickly. 

5b. One of the clever chimpanzees that the Gardners 
raised signed fluently. 

5c. Several of the verbal chimpanzees who talked to 
Premack lied shamelessly. 


All the sentences in set 1 are called “matrix” sen- 
tences, and all the sentences in set 2 are called “con- 
stituent” sentences. One of the axioms of generative 
theory is that any base phrase marker can contain 
the symbol S’. In other words, a constituent phrase 
marker can be at the same time a matrix phrase 
marker, incorporating another embedded sentence 
within it. Transformational-generative syntax is thus 
said to be “recursive,” infinitely recursive, and this 
is one of the reasons that the grammatically ac- 
ceptable sentences in any language are infinite in 
number and, theoretically, infinite in length as well. 
In performance theory, however, there is realistically 
a stopping point determined by the speaker’s inability 
to generate endlessly long sentences and the listener’s 
inability to understand them (Gleitman & Gleitman, 
1970). Such items as “The rat the cat my mother 
bought bit died” (Roeper, 1973) occur with a saving 
infrequency outside linguistic discussions.1 
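
The recursive embedding that produces a sentence like Roeper's example can be illustrated with a short sketch. It captures only the word-order pattern of center embedding (subjects in order, verbs in reverse order); the `center_embed` function and its list-of-pairs input are hypothetical and carry no claim about how the grammar itself represents the embedding.

```python
from typing import List, Tuple


def center_embed(clauses: List[Tuple[str, str]]) -> str:
    """Nest each (subject, verb) clause inside the one before it.

    The outermost clause comes first in the list. Subjects surface in
    list order, while verbs surface in reverse order, so the innermost
    clause is completed first.
    """
    subjects = [subject for subject, _ in clauses]
    verbs = [verb for _, verb in reversed(clauses)]
    return " ".join(subjects + verbs)


sentence = center_embed([
    ("The rat", "died"),      # matrix clause
    ("the cat", "bit"),       # embedded in the matrix clause
    ("my mother", "bought"),  # embedded in the embedded clause
])

print(sentence)
```

Nothing in the function limits the length of the list, which mirrors the point that the grammar sets no bound on embedding; the practical limit comes from the speaker and listener.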


Semantics and Phonology 


Deep structures “determine semantic interpreta- 
tion” and surface structures “determine phonetic in- 
terpretation.” This seems odd at first, because if gen- 
erative theory is to account for the production of 
utterances, one might have supposed that semantics 
(what the speaker “means to say”) would determine 
deep structure, rather than the other way around. On 
the other hand, if generative theory is to account 
only for the reception (understanding) of utterances, 
one might have supposed that the heard utterance 
(the phonetic event) comes first, and then the listener 
infers an appropriate surface structure for what was 
heard, as in the traditional problem of speech per- 
ception. Fodor, Bever, & Garrett (1974, p. 389) resolve 
the puzzle: 


Direction of information flow makes no differ-
ence in a grammar. According to Chomsky,
. . . “the standard theory generates quadruples
[of phonetic representations, surface structures, 
deep structures, and semantic representations]. 
It is meaningless to ask whether it does so by 
‘first’ generating [a deep structure then map- 
ping it onto a semantic representation (on one 
side) and a phonetic representation on the
other].” (material in brackets is theirs) 


They go on to remark: 


In the case of performance models, however, the 
situation is quite different. . . . A performance
model attempts to specify the actual sequence of
computations which underlies the speaker-
hearer’s production and recognition of sentences.
For such models, the direction of information 
flow is critical. (p. 390) 


1 W. K. Honig pointed out to me that such sentences are not
infrequent in German. German’s case inflections, very likely, 
help the reader to make sense of multiply embedded sentences, 
but English, almost devoid of case inflections, offers no such 
help. 
In fact, any performance model that supposes that
surface structures are generated by transformations 
from deep structures has the problem of accounting 
for deep structures themselves—that is, the problem 
of what sets the phrase structure rewriting rules into 
operation. In a rival theory of generative grammar 
called “generative semantics,” “deep structures and
semantic representations are identical” (Fodor et al., 
1974, p. 388), but in the standard theory of transfor- 
mational-generative grammar this is not so, and then 
performance theory requires a prior step in “the trans- 
lation function from mentalese to English” that maps 
“from formulas in mentalese to deep structures” 
(Fodor et al., p. 389). We shall return to the problem 
of “translating from mentalese to English” at the end 
of the chapter. 


THE PSYCHOLOGICAL REALITY OF 
TRANSFORMATIONAL-GENERATIVE 
GRAMMAR 

As a general approach to formulating the prob- 
lems of syntax, the division between deep structures 
(or their relatively unmodified manifestations as 
kernel sentences) and surface structures (generated by 
more elaborate transformations from deep structures) 
seems plausible. There is evidence (e.g., Brown, 1973) 
that children’s first sentencelike utterances (i.e., ut- 
terances longer than a unit verbal response) have the 
simple and straightforward syntactic structure implied 
by the notion of kernel sentence. Only later do utter- 
ances appear that have more complex syntactic struc- 
ture. The point transformational grammarians make 
is that these more complex utterances could be gen- 
erated from one or more kernel sentences (with equiv- 
alent meaning) by application of transformational 
rules. It does not seem implausible that complex sen- 
tences are so generated, that kernel sentences are psy- 
chologically more fundamental, that speakers learn to 
generate them before they learn to transform them, 
and that underlying any complex utterance is one or 
more kernel sentences (or base phrase markers) that, 
in some sense, are prior to it and give rise to it. The 
idea is not unlike Skinner’s theoretical distinction be- 
tween “primary” verbal behavior and autoclitically 
modified verbal behavior. 

These issues, however they are formulated, are 
within the scope of a functional analysis of verbal be- 
havior, for which the central problem is to determine 
the behavioral laws that account for the performances 
of speakers and listeners. If, as seems possible, the 
formal distinction between deep structure and sur-
face structure reflects some aspect of psychological
reality, and if, as seems certain, the formal structure 
of utterances is hierarchical, then functional theory 
must deal with the questions: What are the behav-
ioral variables that determine (or that instantiate)
deep structures and surface structures? and What 
“meanings” are carried by the internal, hierarchical 
structure of utterances that are not carried by the in- 
dividual response terms viewed as an unstructured 
string? (See Robinson, Chapter 21 in this volume.)
Psycholinguistics, long in the thrall of transforma-
tional-generative syntax, seems lately to be shifting
to a greater concern with semantic problems (i.e.,
problems of stimulus control) and pragmatic problems
(i.e., problems of the reinforcement contingencies that
shape and maintain verbal behavior) (e.g., Brown,
1973; Carter, 1975; Farwell, 1975; McNeill, 1974;
Moore, 1973; Slobin, 1975; Thompson & Chapman,
1975). This may reflect psycholinguists’ despair at
solving the problems of syntax set by transformational-
generative grammarians, but it may alternatively re-
flect their growing sense that the solution to these
syntactic problems will come from a fuller analysis of 
semantic (stimulus control) and pragmatic (reinforce- 
ment) variables. 


SKINNER’S FUNCTIONAL THEORY 
OF VERBAL BEHAVIOR 


Skinner’s theory, as MacCorquodale (1969, 1970) 
emphasized, is a plausible extrapolation of the laws of 
animal operant behavior to human verbal behavior, 
embodying the hypothesis that the facts of verbal 
behavior can be accounted for by application of the 
familiar “three-term contingency” of operant analysis. 
In this case, the three-term contingency must explicate 
two functional relations: (1) the function between dis- 
criminative stimuli (or a motivational state) and the 
response they control, and (2) the function between 
the operant unit constituted of this antecedent vari-
able-response relation and the reinforcing conse-
quences shaping the operant (S—R) unit.

Skinner defined the domain of verbal behavior as 
operant behavior whose third term (reinforcement) is 
mediated by other organisms. The definition was in- 
tended to capture, in a nutshell, the communicative 
function of verbal behavior, the fact that it constitutes 
a social transaction between a speaker and a listener 
and that the transaction has utility for both partici- 
pants. Having defined the domain, Skinner proceeded 
to subdivide it by classifying verbal operants in terms 
of their antecedent controlling variables. 
Mands


A mand is a verbal response controlled by an ante- 
cedent motivational state. Formally, the motivational 
state occupies the position in the three-term contin-
gency more commonly occupied by a discriminative
stimulus. It sets the occasion for reinforcement of a 
specific class of response topographies. For example, 
food deprivation sets the occasion when the mand 
“Food” or the mand “Gimme eats” can be reinforced
by a listener’s supplying food. Because food is only a 
reinforcer when an organism is food-deprived, depriva-
tion sets the occasion for reinforcement of the class of
food mands. There is a peculiarly close fit between the 
motivational condition setting the occasion for rein- 
forcement of a mand and the particular form of rein- 
forcer appropriate to that mand, which Skinner 
summed up by saying that the mand “specifies its
reinforcement.” We can avoid the teleological seduc- 
tions of this expression, however, by focusing on the 
antecedent motivational control of the mand. (Note, 
however, that from the listener’s viewpoint the mand 
does “specify” very explicitly what behavior the lis- 
tener should engage in.) In Verbal Behavior, Skinner 
almost invariably speaks of the antecedent conditions 
controlling emission of the mand as “states of depriva-
tion or aversive stimulation,” but this seems to give
an unwanted drive-reduction flavor to the mand that,
as MacCorquodale (1970) pointed out, was never in-
tended. Among “states of deprivation” Skinner in-
cluded such items as needing a pencil to complete a
drawing, but completing a drawing is not your usual
primary reinforcer, and needing a pencil is not your
usual primary state of deprivation. MacCorquodale
(1970) suggests that mands for such conditioned rein-
forcers as pencils to complete drawings are best un-
derstood as under the control of some other, more
primary deprivation correlated with some other, more
primary reinforcer with which pencils and completed
drawings have been paired in the speaker’s past his-
tory. I wonder if a simpler solution would not be to
abandon the primary-conditioned reinforcer distinc-
tion, as Premack (1959) did, and simply to regard rein-
forcers and their correlated motivational states as in
various ways situationally determined. The neo-
Hullian concept of incentive motivation seems a
suitable way to talk about the antecedent motivational
state controlling a mand. (I shall return to this
shortly.) 


Echoics 


An echoic is a verbal response controlled by an 
antecedent verbal stimulus where, in addition, rein-
forcement depends on a strict one-to-one correspond-
ence between the sound pattern of the discriminative
stimulus and the auditory product of the response. 
The stimulus and the auditory product of the re-
sponse must match, within limits set by the reinforc-
ing community. For example, if a parent says “Dino-
saur,” and a child repeats, “Dinosaur,” the parent’s
utterance serves as a verbal stimulus for the child’s 
echoic response. Fragmentary echoics also occur. In 
this case some part of the response matches the sound 
pattern of the prior stimulus, but the match is not 
complete. Alliteration is an example of fragmentary 
echoic verbal behavior. If someone says, “Sing,” and
a speaker responds, “A silly song,” the repeated s
sounds suggest fragmentary echoic control. Rhymes 
are other examples of fragmentary echoics. Whenever 
the sound match is not complete, then other variables 
must be supposed to have determined other aspects 
of the speaker’s response. 


Intraverbals 


An intraverbal is a verbal response controlled by
an antecedent verbal stimulus where, in addition, the 
sound match is lacking. It is the verbal unit par excel- 
lence of conventional paired-associates verbal learning. 
If a parent says, “One, two,” and a child responds, 
“Button your shoe,” the response as a whole is most 
likely a rote-learned intraverbal. (The sound match 
between the last part of “two” and the last part of 
“shoe” suggests fragmentary echoic control as well. In 
a given instance fragmentary echoic control may not 
occur: the child’s “shoe” may simply be part of the 
rote-learned unit phrase “Button your shoe.” Still, the 
historical origin of the rhyme certainly owed some- 
thing to fragmentary echoic control.) More “signifi- 
cant” examples of intraverbals are easy to find. A lot 
of the educated speaker’s repertoire consists of “book-
learned” or otherwise rote-learned intraverbals.


Tacts 


A tact is a verbal response controlled by a non- 
verbal discriminative stimulus. It is the unit most 
susceptible to “semantic” generalization, what Skin- 
ner and psycholinguists (e.g., Moore, 1973) have both 
called “extension.” When a tact, initially learned to
the particular package of stimulus properties making 
up a specific, dated environmental event, later trans- 
fers to other packages of stimulus properties making 
up other specific, dated environmental events, and 
when, in addition, the basis of transfer is exactly those 
stimulus properties the verbal community regards as 
“definitive” for that “word” (the community imposes
differential reinforcement contingencies to bring this 
about), then, for Skinner, we have a case of generic 
extension of the tact and, for cognitive theorists, a case 
of concept learning. (See Robinson, Chapter 21 in this 
volume.) Despite the vocabulary difference, every-
one is talking about the same phenomenon. For ex-
ample, when a child who has learned to tact an ac-
tivity performed by his parent as “walking” extends
the response to other instances of the activity, such 
as his own walking or the dog’s walking, the child has 
acquired the generic tact “walking,” or alternatively, 
the concept of walking. 

When the basis of transfer of the tact is accidental 
stimulus properties that just happen to be shared by 
two environmental events (one of them, of course, the 
event in whose presence the tact was originally rein- 
forced), Skinner speaks of metaphoric extension of 
the tact, and cognitive theorists of overextension of a 
word or concept. Again, despite the vocabulary differ- 
ence, everyone is talking about exactly the same phe- 
nomenon. The literature on children’s language 
acquisition abounds with amusing examples of meta- 
phoric extension. For example, a child who first 
learned to utter “moon” in the presence of the 
moon later extended the response to round postmarks 
and the letter O. (This is so common a metaphoric 
extension, apparently, that we have an expression 
“moon-faced,” meaning round-faced.) Another child
who learned the response “fly” in the presence of the
appropriate little insect later extended the response to 
his own little toes (Clark, 1973, pp. 80-81). 

Finally, when the basis of transfer of the tact is not 
shared properties of two environmental events, but 
rather some fortuitous association of events, then 
Skinner speaks of metonymic extension and everyone 
else speaks of plain “associative learning” or, to revive 
an apt phrase of Thorndike’s (1911), “associative shift-
ing.” According to Ervin-Tripp (1973, p. 267), meto-
nymic extension may be the origin of the syntax of
the possessive in children’s speech. It is common, for 
example, to find a young child saying, “Mommy,” 
when pointing to Mommy’s shoe. (Later on the child 
will learn to tact the shoe as a generic, tact Mommy 
as a metonymic, order them and affix -s to the
metonymic, and utter “Mommy’s shoe.”)

An example of metonymic extension offered by 
Skinner (1957) and discussed by MacCorquodale (1969) 
is uttering the response “orange” in the presence of
a fruit bowl or a breakfast table that sometimes con- 
tains an orange but does not in this instance. The 
sophisticated speaker will usually qualify the response 
with other responses, such as “That fruit bowl makes 
me think of oranges,” or “Strange, there’s no orange
on the breakfast table this morning.” Only “orange” 
is a “tact in metonymic extension.” The qualifying 
terms (“no,” “makes me think of”) are autoclitics.
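
The three kinds of extension distinguished above differ only in the basis of transfer, and that distinction can be sketched as a small decision procedure. The sketch is offered as an illustration, not as Skinner’s own formulation: the function `extension_type`, the representation of events as sets of properties, and every feature name in the examples are hypothetical.

```python
def extension_type(defining, original, new, associated=False):
    """Classify how a tact transfers from an original event to a new one.

    defining   -- properties the verbal community treats as definitive
    original   -- properties of the event in whose presence the tact was learned
    new        -- properties of the event now evoking the tact
    associated -- True if the new event has merely accompanied the original
    """
    if defining <= new:
        return "generic"       # transfer rests on the definitive properties
    if original & new:
        return "metaphoric"    # transfer rests on accidental shared properties
    if associated:
        return "metonymic"     # no shared properties, only association
    return "no extension"


# Hypothetical feature sets, invented for illustration only.
moon = {"round", "luminous", "in the sky"}
moon_defining = {"luminous", "in the sky"}
postmark = {"round", "inked", "on an envelope"}
print(extension_type(moon_defining, moon, postmark))  # metaphoric

orange = {"round", "orange-colored", "edible"}
empty_fruit_bowl = {"concave", "ceramic"}
print(extension_type(orange, orange, empty_fruit_bowl, associated=True))  # metonymic
```

The ordering of the tests reflects the text: the community’s definitive properties take precedence, accidental resemblance comes next, and mere association is the residual case.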


Autoclitics 


An autoclitic, finally, was defined by Skinner as a 
verbal response controlled by stimulation (demonstra- 
ble or merely hypothesized) arising from the speaker’s 
immediately prior (covert, usually) verbal behavior. 
Autoclitics are first cousins of intraverbals, in the 
sense that their controlling stimuli are said to be 
preceding (albeit covert) verbal stimuli (auto = “self,”
clitic = “leaning on”). The autoclitic depends for its 
occurrence on a self-generated verbal stimulus or, 
really, some more abstract and hypothetical preverbal 
(possibly even central) event. Autoclitics can have the 
clear topographical dimensions of an uttered word, or 
they can be, so to speak, dimensionless, being de- 
tectable only by their effect of ordering these more 
familiar terms syntactically. 
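
Skinner’s subdivision of the verbal domain by antecedent controlling variable, as reviewed in the preceding sections, can be summarized as a lookup. The sketch below is an assumption-laden restatement, not part of Skinner’s account: the names `Antecedent` and `classify_operant` are hypothetical, the sound match is reduced to a yes-or-no flag, and fragmentary echoics and multiple control are ignored.

```python
from enum import Enum, auto


class Antecedent(Enum):
    """Hypothetical labels for the antecedent controlling variable."""
    MOTIVATIONAL_STATE = auto()
    VERBAL_STIMULUS = auto()
    NONVERBAL_STIMULUS = auto()
    OWN_VERBAL_BEHAVIOR = auto()  # the speaker's immediately prior behavior


def classify_operant(antecedent, sound_match=False):
    """Name the class of verbal operant for a given antecedent variable.

    sound_match matters only for verbal stimuli: it is True when the
    response reproduces the sound pattern of the stimulus.
    """
    if antecedent is Antecedent.MOTIVATIONAL_STATE:
        return "mand"
    if antecedent is Antecedent.VERBAL_STIMULUS:
        return "echoic" if sound_match else "intraverbal"
    if antecedent is Antecedent.NONVERBAL_STIMULUS:
        return "tact"
    if antecedent is Antecedent.OWN_VERBAL_BEHAVIOR:
        return "autoclitic"
    raise ValueError("unknown antecedent variable")


# A child repeating "Dinosaur" after a parent says it:
print(classify_operant(Antecedent.VERBAL_STIMULUS, sound_match=True))  # echoic
# "Button your shoe" in response to "One, two":
print(classify_operant(Antecedent.VERBAL_STIMULUS))  # intraverbal
```

A single utterance is usually under several of these controls at once, so the lookup describes the classes, not the analysis of any actual sentence.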


Some examples of autoclitics have just been men- 
tioned. In “That fruit bowl reminds me of oranges” 
speakers, so to say, find themselves about to utter 
“oranges” but, noticing that the response is not at 
peak strength, or perhaps vaguely sensing that the 
circumstances are not quite right for saying 
“oranges,” qualify the response in the process of
uttering it. The qualifying autoclitic, having the 
dimension of a “word,” is “reminds.” The sentence 
illustrates the difference between “primary” verbal 
responses and autoclitic responses. “Oranges” is a 
metonymic tact controlled by the fruit bowl as a dis-
criminative stimulus. “Fruit bowl” is a generic tact
controlled by the same discriminative stimulus. “Me” 
is perhaps a generic tact of a complex private event, 
or perhaps should be regarded as another autoclitic: 
a response to the speaker’s privately sensed disposition
to speak. The distinction between primary verbal be-
havior (the immediate responses to the stimulating
environment) and autoclitic behavior (complex re- 
sponses to the disposition to speak) is reminiscent of 
the distinction between deep and surface structure, 
but the correspondence is not exact. (More on this 
later.) 

Skinner suggested that the syntactic structure of 
utterances is determined, at least in part, by autoclitic 
frames. These would correspond, roughly, to such
syntactic sequences as noun phrase—predicate phrase, 
or determiner—noun. Thus, although “reminds” in 
“That fruit bowl reminds me of oranges” is an auto-
clitic having clear response dimensions and is trig-
gered by the prior strengthening of “oranges,” “fruit
bowl,” and something peculiar about the stimulating 
conditions evoking “oranges,” the temporal arrange- 
ment of the terms into a grammatical sentence re- 
quires the notion of a (dimensionless) autoclitic frame. 
(The fixed-interval scallop is, in this sense, a “dimen-
sionless frame” into which concrete responses such as
bar presses or key pecks “fit.” “Temporal frame” is
perhaps a more apt expression.) 

The difficulty with the concept of an autoclitic
frame is the same difficulty posed by deep structures, 
or Robinson’s (this volume) cards with side codes but 
no words written on them. Robinson’s card metaphor 
suggests how the autoclitic frames that “mediate” the 
syntactic connections between words might be learned 
within a reinforcement theory framework, but none of 
the present notions (autoclitic frames, deep structures, 
coded cards) helps one to imagine what the physical 
embodiment of syntactic structure might be. All of 
these notions imply some mysterious preverbal 
processes within the organism, to which Fodor et al. 
(1974) give the name “mentalese.” Autoclitic frames, 
deep structures, and coded cards equally represent 
abstract theories about syntactic behavior, and as 
theoretical concepts they can, of course, function with- 
out reference to physical events within the organism. 
To the extent that one takes them as references to real 
biological events, they must be understood as meta- 
phors. (More on this, too, later.) 


COMMENTS ON SKINNER’S
FUNCTIONAL THEORY 


This brief review has laid out the bare elements of 
Skinner’s theory of verbal behavior. For fuller exposi-
tions, the interested reader should consult MacCor-
quodale (1969, 1970), Segal (1975a), or best of all,
Skinner himself (1957). As MacCorquodale (1969)
noted,


Everything considered, the basic explanatory 
apparatus seems very meager, while verbal be- 
havior is very complex. [But] the power of a 
single variable is seen to multiply when we take 
into account its multiplicity of effects... . 
First, a single variable controls many responses, 
giving a speaker a great deal to say even in a 
static environment. ... [Second,] if several 
responses are concurrently strong, additional 
variables . . . control the order of their emis- 
sion and whatever response selection, and rejec- 
tion, occurs. (pp. 838-839) 
Skinner’s theory remains in its pristine state. It has
been too little modified since its publication in 1957, 
first because behaviorists have apparently not been 
interested in subjecting it to the experimental test and 
revision appropriate to theories, and second, because 
psycholinguists, who are acquainted with real data 
germane to the theory, have not been at pains to im- 
prove a theory they scorn. Even so, the theory is 
powerful, it is in touch with much of the data that
psycholinguists have uncovered, and it is available as 
a means of systematizing and making sense of what 
psycholinguists are finding out, especially now that 
their interests have begun to shift to “semantics” and 
“pragmatics.” In the interests of advancing under- 
standing as rapidly as may be, one hopes that psycho- 
linguists will not insist on working out all over again 
the integrative insights into semantic and pragmatic 
issues that Skinner’s book makes available, free for 
the asking. 


Comments on the Mand 


What is one to say about the reinforcers, and the 
associated controlling motivational states, for novel, 
generative mands? Suppose, to take an example of 
Skinner’s (1957), a child in a toy store sees an attrac- 
tive new toy whose name he doesn’t know. On being 
told by his parents that it’s a “doodler,” he immedi-
ately says, “Buy me a doodler!” As Skinner notes,
“He has never been reinforced for this response in the 
manner required to construct a mand” (1957, p. 188). 
If the response has never been reinforced, it could 
never have come under the control of the child’s new 
motivational state. What accounts for its emission, 
then? Skinner suggests, “It is possible that all mands 
which are reinforced by the production of objects or 
other states of affairs may be interpreted as manding 
the behavior of the listener and tacting the object or 
state of affairs to be produced” (1957, p. 180). Let us 
consider this further. 

The doodler is an incentive, automatically induc-
ing a novel motivational state. Piaget (1952) gives
many examples of novel objects apparently function- 
ing automatically as incentives—automatically, that is, 
presupposing certain prior sensorimotor or “cogni- 
tive” learning. Novel incentives of this kind do not 
clearly sort out as primary or conditioned reinforcers. 
Perhaps they are best regarded as induced (Segal, 
1972) incentives, their incentive value arising in some 
yet-to-be-explicated way from prior learning. Being 
incentives, they induce a correlated motivational state 
(incentive motivation) and so insure that any behavior 
that results in obtaining the incentive will be rein-
forced (that, as I take it, is the definition of incentive).
(See Ayllon & Azrin, 1968, for a demonstration of 
human incentive motivation at work.) 

Given a novel motivational state induced by a 
novel incentive and an environmental setting which, 
in concert with incentive motivation, has already 
come to control mands of the form “Buy me
______,” the mand frame is emitted. The “gen-
erative” appearance of the mand frame on this novel
occasion, then, is understandable if all induced incen-
tive motivations, novel or not, have enough in com-
mon, and if all stores (places where incentives are
bought) have enough in common, so that these two
variables together are able to support generative ap-
pearance of the response form “Buy me ______.”


The appearance of the response “a doodler” in the
blank in the mand frame is still to be explained. 
Other things being equal, a response that tacts an 
attractive incentive should have a high probability of 
emission (as a tact, mind you, not yet as a mand). 
There must have been occasions in the past when the 
child “merely tacted” an incentive but was under-
stood by his parents to be manding, and so (unex-
pectedly) the child found the incentive object handed
over to him. Supposing that the incentive had aroused 
an incentive motivation, obtaining the incentive 
should have reinforced the tact response, which from 
then on could be evoked, as mand, by the motiva- 
tional state, as well as evoked, as tact, by the object 
itself. After a few such experiences, all responses that 
were previously acquired as tacts (when the child was 
not motivated to have the object) might be available 
also as potential mands when the child was in the 
appropriate incentive motivational state. In short, a 
new set of conditional discriminations, with genera- 
tive effects, would have been learned. Given an incen- 
tive, and given availability of a tact of the incentive, 
and given a correlated incentive motivation, the child 
would emit the tact. The specified reinforcement 
would then follow this emission of the tact, which 
would then come under independent control of the 
motivational state and would no longer depend, as 
mand, on the presence of the incentive to evoke it. 
(Notice one new factor here. We presuppose that, 
given a novel incentive motivation, the speaker can 
discriminate the incentive object that roused the 
motive and will not be likely to tact irrelevant ob- 
jects.) The analysis suggests why speakers are prone 
to tact their motivational states by reference to ex- 
pected reinforcers, giving to statements of intention 
their characteristically teleological flavor. 

This account is given at length to suggest what 
must be a common and powerful basis for generative
(creative, productive) manding: tacting incentives. We 
shall find problems enough later on when we come to 
consider generative syntax, but generative echoic re- 
sponding, generative tacting, and generative manding 
do not seem to present serious difficulties for Skinner’s 
functional theory of verbal behavior. Working 
through the functional analysis in individual cases 
and testing it against experimental or naturalistic 
data are not necessarily easy, but they can be done. 
The general strategy for attacking the semantic and 
pragmatic problems of language seems clear. 


The Reinforcement of Verbal Behavior


Once Skinner leaves the discussion of the mand, the 
third term in the three-term contingency (the reinforc-
ing consequences of verbal responses) is taken pretty
much for granted—not because reinforcement is unim- 
portant in the acquisition and maintenance of verbal 
behavior, but rather because the reinforcements flow 
so naturally from the fact that the speaker is commu- 
nicating with a responsive listener. Except for a limited 
class of mands that specify primary commodity rein- 
forcers the listener is to supply, the reinforcers for 
verbal behavior mostly reside in the social response 
of the listener. Furthermore, they may be self-supplied 
by the speaker, in his or her role as a listener. Much 
of a mature speaker’s verbal behavior is addressed to 
oneself, as self-instructions and the like. Whatever
reinforcers ultimately flow from the speaker-listener’s 
reactions to their own self-instructions provide the 
necessary conditions for the maintenance of self-ad- 
dressed verbal behavior. 

Finally, intermittent reinforcement must play at 
least as important a role in the acquisition and main- 
tenance of verbal behavior as it plays in the control 
of other operant behavior. It would not violate the 
spirit of a behavioral analysis if it turned out, even, 
that new verbal operants (new stimulus-response 
units) are often learned without any evident rein- 
forcement. (MacCorquodale, 1970, makes the same 
point.) A kind of secondary contiguity principle may 
operate, such that the direct reinforcement of some of 
the speaker’s verbal acquisitions generates a strong 
and persisting tendency to acquire new verbal oper- 
ants even when there is no possibility of their imme- 
diate reinforcement. Experimental analyses of gener- 
ative imitation—for example, Sherman (1971)—seem 
to demonstrate the operation of such a secondary 
contiguity principle. Such a principle also seems the 
most parsimonious way to account for much of the 
paired-associates learning that goes on in the verbal 
learning laboratory, as well as in the natural environ-
ment. (Possibly the relatively low energy requirements
involved in vocal responding make it, like perceptual 
responding, especially available to a simple contiguity- 
learning process.) 


The Stimulus Control of Verbal Behavior 


When we leave the mand, we leave a dominating 
concern with reinforcement and correlated motiva-
tional states. From here on, the focus of the analysis
is on stimulus control. Echoic control leads eventually 
to the emergence of the “minimal (phonemic) echoic 
repertoire,” enabling speakers to match, phoneme by
phoneme, verbal stimuli they hear, even for the
first time. Generative echoic responding raises no ex-
planatory problems just because the echoic stimulus
provides the speaker with such detailed and explicit 
instructions on what to do. Also arising out of the 
minimal echoic repertoire are formal classes of re-
sponses controlled by isolated phonetic features of
verbal stimuli. In the dominant vernacular of the
day, responses in the “memory store” can be ad-
dressed by such “retrieval cues” as “Give me some
words that begin with p” or “Give me some words
that rhyme with luck” or “Give me some words in 
spondaic meter.” 

Among the by-products of the development of 
intraverbals and tacts are the appearance of thematic 
classes of responses controlled by events (whence
“episodic memory”) and by words (whence “semantic
memory”). The organization of verbal memory is cur-
rently an active topic of research. Skinner argued that
all the organization to be found in the speaker’s inter-
related tact and intraverbal repertoires (i.e., in his or
her thematic classes) was the result of sheer contiguity,
but contemporary memory researchers, reviving some
of the perceptual insights of Gestalt psychology, have
insisted that other factors besides sheer contiguity are 
important. Hilgard and Bower (1975) review some of 
the arguments in their chapter on Gestalt psychology. 
To summarize their arguments, responses tacting 
events that occur together in the speaker's perceptual 
experience are much more likely to become inter- 
related in memory (such that the perceptual cues con- 
trolling each tact as a generic will also have a high 
probability of evoking the other tacts as metonymics) 
if the original events were so structured as to be per- 
ceived as related to one another in some intrinsic way. 
For example, two stimulus features structured so as to 
be attributes of a coherent figure are more likely to 
evoke one another’s generic tacts as metonymics than 
are two stimulus features presented with identical 
spatial contiguity but so structured that one is per-
ceived as a figural attribute and the other as an inde-
pendent attribute of the background. (See Hilgard
and Bower, 1975, Figure 8.6, p. 265, for several 
examples taken from experiments by Asch, 1969.) The 
arguments against sheer contiguity seem, on balance, 
to be persuasive. It seems clear now that there are 
much more interesting things to say, and to discover, 
about the organization of verbal memory than simply 
that it derives from contiguity of perceptual experi- 
ences. 

Despite the fall of sheer contiguity as the exclusive 
organizing principle of thematic response classes, Skin- 
ner’s insight into the organization of the verbal reper- 
toire in terms of controlling stimuli remains valid. 
Some of the phenomena that memory researchers 
puzzle over seem not quite so puzzling, nor their con- 
temporary explanations quite so original, viewed in 
this light. “Encoding specificity” (Tulving & Thom- 
son, 1973) is the current expression for the inde- 
pendence of verbal operants controlled by different 
antecedent stimuli. Some of the phenomena of en- 
coding specificity, such as the greater ease of “recall” 
over “recognition” in some circumstances (Tulving, 
1974; Tulving & Thomson, 1973), or the appearance 
of formal (“auditory”) “errors” in immediate free re-
call (“short-term memory”), replaced by thematic
(“semantic”) “errors” in delayed free recall (“long-
term memory”) (Kausler, 1974), could have been pre-
dicted from Skinner’s analysis. It must be said, how-
ever, that they were not predicted. Alas, no original
research on verbal memory flowed from Skinner’s 
theory, because no one conversant with it was inter- 
ested in doing the research, and so equivalent theo- 
retical explanations have had to be reinvented with- 
out help from Skinner, and behaviorists can take no 
credit for the interesting results that have been emerg- 
ing in research on verbal memory. 


Control by Relations Between Stimuli 


This is not to say there are no difficulties in Skinner’s 
account of the stimulus control of verbal responses. 
There are basic problems in the definition and speci- 
fication of controlling stimuli. For example, Skinner 
suggested that the regular past tense inflection on 
English verbs, -ed, is a tact controlled by “that subtle 
feature of the environment called action-in-the-past.” 
But it is not clear that action in the past is an ex- 
plicit feature of the physical environment. Other rela- 
tional terms present the same problem. Is plurality, 
which is said to control the tact inflection -s on nouns, 
an explicit feature of the physical environment? What
are the physical dimensions of the stimulating en- 
vironment that control relational terms such as 
“familiar,” “similar,” “Mozart” (said of a piece of 
music), “Dutch” (said of a painting), and so on? Cog-
nitive theorists (e.g., Chomsky, 1959) object to Skin-
ner’s treating the stimuli controlling such tacts as
purely objective, physical events. They would argue, 
for example, that the relation of similarity is not in 
the physical stimuli, but in the way organisms (are 
built to) react to (“process”) them. Skinner would 
agree. The issue, of course, is not whether organisms
can respond in such a way as to exemplify the rule 
“Pick the comparison stimulus that is similar to the 
sample stimulus.” The literature on matching to 
sample clearly demonstrates that they can. The issue is 
whether it is necessary to posit a “cognitive process” 
or a “comparator device” within the organism, over
and above what is in the external physical world, to 
account for the organism’s ability to match to sample. 
I think a comparator device is called for, and be- 
havior theory will have to make room for it. In gen- 
eral, relations are not describable solely in physical 
terms. Behaviorists can, of course, give an operational 
definition of what it means for an organism’s behavior 
to be controlled by relations between physical stimuli, 
but the fact that such operational definitions require 
mention of an organism’s behavior makes such defini- 
tions question begging, so far as cognitive theorists 
are concerned. 

A point that is rarely mentioned, and little under-
stood, is that it does no violence to functional be-
haviorism to concede that the stimulus relations
which control behavior may not literally be in the 
environment but in the way the organism reacts to the 
environment. Functional behaviorists take for granted 
that the organism brings something of its own to the 
behavior-environment interaction that defines the 
operant. Perhaps this should be mentioned more 
often. MacCorquodale (1970) spent a good deal of 
space on this matter, but it was almost as an aside that 
Skinner himself remarked, in Verbal Behavior, “All 
behavior, verbal or otherwise, is subject to Kantian a 
priori’s in the sense that man as a behaving system has 
inescapable characteristics and limitations” (1957, p. 
451). Among the “inescapable characteristics” that 
organisms bring to an operant transaction with the 
environment is their disposition (call it “cognitive” or 
not) to perceive (even impose) relations between phys- 
ical events. The relations do not inhere in the physical 
events themselves; Skinner pointed out, for example, 
that it is not literally a temporal relation between 
earlier and later occurrences of a stimulus that marks 
the later-occurring stimulus as “familiar,” but some
change in the way the organism responds to earlier 
and later occurrences. Still, perceived relations among 
physical events are describable, even if it takes an 
organism (such as the experimenter) to describe them, 
and this is all that a functional analysis requires. Only 
if the experimental subject were able to perceive a 
relation that the experimenter could not would func-
tional analysis be in trouble, for then the experi-
menter would be unable to formulate the deter-
minants of the subject’s behavior. What divides
behaviorists and cognitive theorists on this issue is
whether to focus on analyzing the structure of the
environment or the structure of the organism. Clearly,
both participate in determining the functional rela-
tions between organism and environment embodied in
operant behavior. The behaviorists focus on the en-
vironment, regarding the internal structure of the
organism as beyond reach of current experimental 
technique. The cognitive psychologists focus on the 
organism, even if they must invent the internal struc- 
ture they seek to understand. 


Transfer of Verbal Responses to New Stimuli


Although the problem of similarity does raise ques- 
tions about the necessity of “filtering” the description 
of the stimulus relations controlling the tact “similar” 
through another perceiving organism, this problem is 
not necessarily involved in generative (productive) 
extensions of verbal responses. Skinner’s treatment of
extensions of stimulus control is a “common elements
theory of transfer” (Prokasy & Hall, 1963). Perhaps
that is why Skinner avoided the expression “stimulus
generalization” in Verbal Behavior, an avoidance
MacCorquodale (1969) found puzzling. Still, a too-
casual use of the term stimulus causes confusion. If a
tact were extended from one “stimulus” to another,
nonidentical “stimulus,” the extension would have to
be on the basis of some troublesome “similarity.” But
if a tact is extended from one collection of stimulus 
elements or properties or features, each of which can 
be separately specified, to another collection of stim- 
ulus elements or properties or features, on the basis of 
shared, common, identical elements or properties or 
features, then a “comparator device” that evaluates
degree of similarity is not required. For cognitive 
theorists the problem of “recognition” of identical 
stimuli then replaces the problem of evaluating the 
degree of similarity of nonidentical stimuli. Behavior- 
ists can shed no light on the question of stimulus 
recognition. They simply take it for granted. 
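
The common-elements argument can be given a small illustration. The following Python sketch is offered only as one assumption-laden reading of the argument, not as anything found in Skinner or in Prokasy and Hall: stimuli are modeled as sets of separately specifiable features, the feature names are hypothetical, and the proportion-of-shared-features measure is an invented convenience. Extension of a tact depends only on the identity of elements (set intersection), so no graded comparison of whole stimuli is computed.

```python
# Hypothetical sketch: a stimulus is a set of separately specifiable features.
# Transfer rests on identical shared elements, not on a judged "similarity".

def shared_elements(trained, novel):
    """Return the features common to the trained and the novel stimulus."""
    return set(trained) & set(novel)


def transfer_strength(trained, novel):
    """Assumed measure: proportion of trained features present in the novel stimulus."""
    trained = set(trained)
    if not trained:
        return 0.0
    return len(trained & set(novel)) / len(trained)


# Illustrative (invented) feature collections.
trained_stimulus = {"red", "round", "small"}
novel_stimulus = {"red", "square", "large"}

common = shared_elements(trained_stimulus, novel_stimulus)
strength = transfer_strength(trained_stimulus, novel_stimulus)
```

Note that the sketch still presupposes that the feature "red" in one collection is recognized as the same feature in the other, which is exactly the problem of stimulus recognition that the text says behaviorists take for granted.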


The Functional Approach to Syntax 


The difference between functional and cognitive 
approaches to verbal behavior can be illustrated in the 
approach to problems of syntax. Schumaker and Sher- 
man (1970) applied differential reinforcement con- 
tingencies that succeeded in getting retardates to utter 
the verb inflections -ed and -ing in appropriate intra-
verbal contexts—for example, in such sentences as
“Now the man is painting. Yesterday he . . .
painted” and “Yesterday the man painted. Now he is
. . . painting.” Echoic prompts were initially used to
induce the inflectional responses on sample verb stems
such as paint, but eventually the subjects added the
inflections to novel untrained verb stems such as skate
under appropriate intraverbal control of the phrases
“Now he is . . .” or “Yesterday he . . . .” This quali-
fies as generative (productive) syntactic behavior be-
cause (1) from an unfamiliar stimulus word such as
skating subjects extracted the verb stem (skate); (2)
they added the alternate ending to yield skated; and
(3) they uttered the two response elements in gram-
matically correct order (skated, not edskate).

In several experiments (Garcia, Guess, & Byrnes,
1973; Guess, 1969; Guess & Baer, 1973; Guess, Sailor,
Rutherford, & Baer, 1968; Sailor, 1971), experimenters
trained retardates to utter the plural noun inflection
-s under appropriate nonverbal stimulus control—that
is, to tact plurality. Again this qualifies as generative
syntactic behavior because (1) subjects added -s to
novel noun forms on which they had not had echoic
training in the plural; (2) they uttered the plural
forms in response to newly paired objects (familiar
previously only as singletons); and (3) they uttered
the response elements in grammatically correct order
(e.g., rocks, not srock). Again, the experimental
methods employed to bring about productive syntac-
tic behavior involved differential reinforcement of
echoically prompted responses and gradual shifting of
stimulus control from echoic prompts to nonverbal
stimuli (single objects or pairs of objects). Lutzker and
Sherman (1974) employed comparable procedures to
produce generative utterance of the verb forms is and
are under appropriate nonverbal stimulus control of
pictures of single or several actors engaging in some
action.

Whitehurst (1971, 1972) taught 2-year-old children 
a miniature artificial language consisting of some 
nonsense syllables that functioned as color “adjec-
tives” and some nonsense syllables that functioned as
“noun” labels of geometric forms. A procedure in- 
volving differential reinforcement, echoic prompting, 
and shifting of control from echoic to nonverbal 
(color and form) stimuli created appropriate func-
tional response classes (a class of color tacts and a class
of form tacts) and appropriate syntactic ordering of 
response terms in two-word utterances. The prescribed 
order was color tact (adjective) followed by form tact 
(noun), as in standard (surface) English. By these pro- 
cedures Whitehurst succeeded in producing generative 
syntactic behavior: given novel combinations of famil-
iar color-plus-form stimuli, the children correctly
tacted the two properties in the prescribed adjective- 
noun order. 

Premack (1970) has also reported getting his chim- 
panzee subject, Sarah, to emit generative tact se- 
quences in syntactically prescribed orders; Gardner 
and Gardner (1969, 1971, 1975) have reported gen-
erative syntactic response sequences in their chimpan-
zee subject, Washoe; and Rumbaugh, von Glasersfeld,
Warner, Pisani, and Gill (1974) obtained generative 
syntactic mand sequences in their chimpanzee subject, 
Lana. 

These are only an exemplary few of the spate of 
research reports in recent years dealing with a func- 
tional analysis of simple syntactic behavior. In all this 
work, explicit training procedures involving differ- 
ential reinforcement, prompting, and stimulus fading 
(associative shifting) were employed to bring about 
the productive syntactic behavior. Premack, however,
has concisely stated the demurrer that divides be-
havioral and cognitive theorists in dealing with prob-
lems of verbal behavior: “A strict training procedure
is not an explanation of how, as a result of carrying 
out the prescribed steps, the organism accomplished 
the function in question. A recipe is a method, not a 
theory” (1970, p. 107). Functional behaviorists are 
content to devise “strict training procedures” that suc- 
ceed in producing generative syntactic behavior in 
experimental subjects. In the course of devising effec- 
tive procedures, such research unavoidably identifies 
critical functional variables controlling syntactic be-
havior, insofar as these reside in environmental con-
tingencies among the antecedent variables, the be-
havior, and the reinforcing consequences. A func-
tional analysis, however, does not attempt to specify
what, apart from its ontogenetic history, the organism 
brings to the learning task. Premack (1970) suggested 
that strict training procedures do no more than teach 
the experimental subject to “map concepts” it already 
possesses—that is, to tact the environmental events and 
relations it already perceives. (Also see Robinson, 
Chapter 21 in this volume.) Premack doubted the 
possibility of teaching subjects to perceive relations 
they do not perceive spontaneously. This is an open 
question for the functional behaviorist. It is interest-
ing that some psycholinguists (e.g., Brown, 1973) en-
tertain the possibility that the environmental relations
a young child tacts in his first multiunit utterances 
are exactly those relations he learned to respond to 
nonverbally in the immediately prior sensorimotor 
stage of development (Piaget, 1952). In any case, it is 
certainly true that behaviorists have shown no inter- 
est in the question of what internal cognitive machin- 
ery makes possible the perception of relations, al- 
though this kind of question is the meat-and-potatoes 
of cognitive theorists. Premack is correct: a functional 
analysis is not a cognitive theory (although Skinner’s 
hypothesis of the autoclitic probably qualifies as a 
cognitive theory). 


FUNCTIONALISM VS. MENTALISM 


Theories 


The question of whether to have or not to have 
theories (alternatively, what constitutes a proper ex- 
planatory theory) is at the heart of the so-called men- 
talist-behavioral controversy. Cognitive theorists often 
employ the vocabulary of mentalism, but a look at the 
footnotes (e.g., Chomsky, 1965, pp. 193-194) reveals 
that all they mean by mind is a theory about the 
internal machinery of the organism. Sometimes such 
theories are couched in the metaphors of neurophys- 
iology, sometimes in the metaphors of computer and 
information science, sometimes in the metaphors of 
plant genetics (unfolding of phenotypic characters in 
a favorable soil, and so on). But behaviorists should 
take note that the temptation to theorize in metaphors 
appropriate to another domain is so great, up against 
the mysteries of syntax, that even Skinner succumbed. 
The concept of the autoclitic is a groping step toward 
a theory of syntactic behavior, and, like the theories 
Skinner elsewhere (1950) enjoined, the theory of the 
autoclitic is metaphoric. (MacCorquodale, 1969, dis- 
agrees.) It proposes that the unobserved processes that 
must be supposed to underlie the observed (surface) 
syntactic structure of verbal behavior are in some 
sense operant, albeit covert, even “preperipheral.” It 
remains to be seen if Skinner’s choice of metaphor for 
the unseen processes underlying syntax was apt. The
internal parts of machines rarely resemble their ex- 
ternal parts, and so other cognitive theorists (for in 
this matter Skinner himself must be classed as a cog- 
nitive theorist) have opted for other metaphors. Note 
that “deep structure” and “surface structure” and the 
whole transformational-generative apparatus make up 
an extended metaphor, too, insofar as they are 
claimed to describe psychological processes and not
simply to stand as an abstract, formal analysis. 


Nativism vs. Environmentalism 


The so-called nativist-environmentalist controversy 
between behaviorists and cognitive theorists is another 
expression of the different emphases they give to the 
functional analysis of environmental contingencies of 
reinforcement versus the invention of models of the 
organism’s internal machinery. The following passage 
briefly summarizes Chomsky’s supposedly nativist view 
of language learning: 


Consider an acquisition model . . . that uses
linguistic data to discover the grammar of the
language. . . . Just how the device . . . selects
a grammar will be determined by its internal
structure, by the methods of analysis available
to it, and by the initial constraints that it im-
poses on any possible grammar. If we are given
information about the pairing of linguistic data 
and grammars, we may try to determine the 
nature of the device. (1972, p. 119) 


This passage appears to rest all the weight of syntactic 
development on genetically given characteristics of the 
organism. Yet in a footnote, Chomsky (1965, p. 202)
acknowledged that the language-acquisition device's 
internal structure “might possibly be developed on 
the basis of deeper innate structure, in ways that de- 
pend in part on primary linguistic data and the order 
and manner in which they are presented.” This
sounds like learning. Finally, Chomsky drew a distinc-
tion between “two functions of external data—the
function of initiating or facilitating the operation of 
innate mechanisms and the function of determining 
in part the direction that learning will take” (1965, 
p. 34). 

The influence of the organism’s “deeper” internal 
structure on the grammar it “selects” is a matter of 
Kantian a prioris, the organism’s “inescapable charac- 
teristics and limitations.” Again it should be said that 
behaviorists (until quite recently) have been amiss in
too seldom acknowledging genetic constraints and 
thus allowing the misconception to be broadcast that 
behaviorism denies their very existence. On the other 
hand, Chomskians have been derelict in too seldom 
mentioning Chomsky’s admission that the language- 
acquisition device’s internal structure and functioning 
might very well evolve under the influence of environ- 
mental learning contingencies. 

Chomsky’s distinction between two functions of ex- 
ternal data, that of engaging internal learning and 
perceptual mechanisms and that of determining the
behavioral direction of learning, corresponds approx- 
imately to Premack’s (1970) distinction between the 
organism’s perceiving environmental events and rela- 
tions and its learning to map its perceptions verbally. 
The search for “linguistic universals” through com-
parison of culturally given “pairing[s] of linguistic
data and grammars” then becomes, not a search for 
the unknowable noumena behind verbal behavior, 
but a search for functional correlations between en- 
vironmental contingencies and syntactic behavior— 
that is, a search for the general characteristics of 
verbal behavior “ultimately determined by the genetic 
and ecological universals of a species” (Braine, 1971,
p. 185). The search for linguistic universals, in short, 
is part of the search for general laws of behavior, and 
surely the physiological constitution of organisms 
helps to determine the form of such laws. Psycho- 
linguistic theories of linguistic performance modeled 
after abstract theories of linguistic competence antic- 
ipate physiological knowledge by proposing models of 
neurological functioning in advance of empirical 
neurological findings. As Salzinger (1975) has noted, 
behaviorists regard such anticipatory theories as no 
more necessary than physiological theories of learning 
in general, but there is no attempt to proscribe
physiologizing by those so inclined. 


Lashley’s Critique 


Lashley’s (1951) classic paper on the problem of 
serial order in behavior is often cited by cognitive 
theorists as posing an unanswerable challenge to a be- 
havioral analysis of language, but in fact Lashley’s 
paper (which is perhaps more widely cited than read) 
indicts, not behavioral analysis, but simplistic
physiologizing in terms of “concepts of the reflex arc,
or of associated chains of neurons” (p. 526 in the
Beach et al., 1960, reprinting). Skinner has consistently
refrained from such simplistic physiologizing, even in
his most theoretical discussions of syntax. To quote
Lashley further:


Language presents in a most striking form the 
integrative functions that are characteristic of 
the cerebral cortex. . . . In spite of the ubiquity 
of the problem [of serial order] there have been 
almost no attempts to develop physiological 
theories to meet it. . . . I have chosen to discuss 
the problem of temporal integration here, not 
with the expectation of offering a satisfactory 
physiological theory to account for it, but be- 
cause it seems to me to be both the most im- 
portant and also the most neglected problem of 
cerebral physiology. (in Beach et al., pp. 507-508;
italics mine) 


In the remainder of the paper Lashley proceeds to 
describe the problem of syntactic behavior with a per- 
ceptiveness befitting a great psychologist. So far from 
conflicting with Skinner’s theory of autoclitic be- 
havior, Lashley’s description closely parallels it. (It is 
not clear which came first, inasmuch as Skinner’s Wil- 
liam James Lectures on verbal behavior were given at 
Harvard University in 1947 and widely circulated in 
mimeographed form thereafter. Questions of priority 
are hardly relevant in any case.) Continuing from Lashley:


There are indications that, prior to the internal 
or overt enunciation of the sentence, an aggre- 
gate of word units is partially activated or 
readied. (p. 512) 


Compare Skinner: 


The important properties of verbal behavior 
which remain to be studied concern special ar- 
rangements of responses. Part of the behavior of 
an organism becomes in turn one of the variables 
controlling another part. . . . The events avail- 
able to him as stimuli consist of the products of 
his own behavior as speaker. He may hear him- 
self or react to private stimuli associated with 
vocal behavior, possibly of a covert or even in-
cipient form. . . . The term “autoclitic” is in-
tended to suggest behavior which is based upon
or depends upon other verbal behavior. (1957, 
pp. 315-315; italics mine) 


Also from Skinner: 


The manipulation of verbal behavior, particu-
larly the grouping and ordering of responses, is
. . . autoclitic. Responses cannot be grouped or
ordered until they have occurred or at least are 
about to occur. (p. 332; italics mine) 


Once more from Skinner: 


Much of the self-stimulation required in the 
autoclitic description and composition of verbal 
behavior seems to occur prior to even subaudible 
emission. In both written and vocal behavior
changes are made on the spur of the moment 
and so rapidly that we cannot reasonably attrib- 
ute them to an actual review of covert forms. 
. . . Evidently stimulation associated with the 
production of verbal behavior is sufficient to en-
able one to reject a response before it has as-
sumed its final form. The subject is a difficult
one because it has all the disadvantages of 
private stimulation. (p. 371; italics mine) 


Now back to Lashley:


There are at least three sets of events to be
accounted for. First, the activation of the expres- 
sive elements (the individual words . . .) which 
do not contain the temporal relations. Second, 
the determining tendency, the set, or idea... . 
Third, the syntax of the act, which can be de- 
scribed as an habitual order or mode of relating 
the expressive elements; a generalized pattern or 
schema of integration which may be imposed 
upon a wide range and a wide variety of specific 
acts. This is the essential problem of serial
order: the existence of generalized schemata of 
action which determine the sequence of specific 
acts. (p. 515)


These passages from Skinner and Lashley pose the 
problems of syntactic behavior almost identically, 
except for vocabulary differences. It is unclear, there- 
fore, why cognitive theorists approve Lashley’s for- 
mulation and reject Skinner’s. Skinner carefully re- 
frained from speculating about the internal locus and 
dimensions of autoclitic processes, regarding the 
organism’s physiological machinery as outside his pur-
view. The whole extent of his “physiologizing,” if one
should call it that, was to assign the status of covert
“responses” and “response”-produced “stimulation” to
the hypothetical controlling variables immediately 
responsible for the autoclitic processes that order 
verbal behavior into surface syntactic structures. Lash- 
ley, with his excellent physiological credentials, went 
much farther and translated the behavioral problems 
of serial order into problems and hypotheses for 
neurophysiology. Translations of problems from the 
vocabulary of one science into the vocabulary of 
another do not, of course, constitute explanations. So 
far as I know, the physiological problems Lashley set 
in 1951 have not yet found solutions. 


THE COMPLEMENTARITY OF FUNCTIONAL 
AND COGNITIVE THEORY 


The abiding problem is structure. As Lashley 
noted, problems of temporal structure are not con- 
fined to language, but exist everywhere in behavior. 
Research on operant behavior divides into two broad 
categories: research on the exteroceptive discrimina- 
tive stimulus control of operants (which has broad 
though so far mostly unexploited relevance to the
semantics of verbal behavior), and research on the 
determinants of temporal structure in schedule-gen- 
erated performances. Ferster and Skinner (1957) de- 
voted a large book to the hypothesis that the temporal 
structure of schedule-generated performances is at 
least partially to be explained in terms of stimuluslike 
processes generated by the organism’s own behavior. 
As they put it, “The primary purpose of the present 
book is to present a series of experiments designed to 
evaluate the extent to which the organism’s own be- 
havior enters into the determination of its subsequent 
behavior” (p. 13). Although they did not say so, it
seems possible that Skinner’s hypothesis of autoclitic 
processes in syntax motivated Ferster and Skinner's 
choice of a guiding hypothesis for their extended 
experimental analyses of schedules of reinforcement. 

Research on the temporal structure of operant be- 
havior continues. Recently, Hawkes and Shimp (1975) 
succeeded in demonstrating that differential contin-
gencies of reinforcement imposed directly on temporal
structure were effective in generating the prescribed
structure in the key pecking of pigeons. They shaped
an “ideal” fixed-interval-like scallop by delivering a 
reinforcer at the end of a 5-sec trial only if the pattern 
of responding during the trial approximated constant 
acceleration. Research on higher-order schedules of 
reinforcement (e.g., Findley, 1962; Kelleher, 1966) has 
demonstrated that hierarchically organized behavioral 
structures, too, are within the operant purview. It is 
not too much to hope that further research on higher- 
order schedules will illuminate not only how the 
structured nonverbal performances generated by first- 
order schedules become organized into higher-order 
(hierarchical) structures, but also how the hierarchical 
structure of verbal behavior comes about. 


The Problem of Verbal Structure Stated
in Functional Terms 


Let us review the problem. The string of verbal re- 
sponses that makes up the surface of a kernel sentence 
is hierarchically structured. Kernel sentences can be 
parsed into two broad constituents, a “subject” and a
“predicate.” In the kernel sentences of English, the 
responses constituting the subject constituent are 
uttered first and responses constituting the predicate 
constituent second. Speakers must learn these two 
temporal positions within the sentence. There is evi- 
dence (Brown, 1973) that children just entering the 
mysteries of syntax have in fact learned them (or are 
in process of learning them). That is, children in 
“stage 1” of syntactic development, when two-word (or 
two-morpheme) utterances first appear, tend to utter
an “uninflected” unit response interpretable as “sub-
ject” followed by an uninflected unit response inter-
pretable as “predicate.” (To be sure, the interpreta-
tions are generous, but they seem to correspond to the
way parents respond to their children’s utterances.) 

But the subject “noun phrase” can itself be parsed 
into an (optional) “determiner” (the, a, and so on) 
and an (obligatory) “noun” or “pronoun.” In kernel 
sentences of English, the responses constituting the 
determiner (if there is one) are uttered first within the 
subject phrase, and responses constituting the noun 
are uttered second. Speakers must learn these two
temporal positions within the subject constituent; that 
is, they must learn the “internal structure” of the sub- 
ject phrase. 

The predicate “verb phrase,” too, can be parsed 
into an “auxiliary” (at the least, a “tense marker”)
and a “verb.” In simple kernel sentences of English, 
the responses constituting the verb are uttered first 
within the predicate phrase, and the responses con- 
stituting tense are uttered as a suffix inflection on the 
verb. Speakers must learn these two temporal posi-
tions within the predicate phrase; that is, they must
learn the “internal structure” of the verb phrase. (In 
standard transformational-generative grammar, the 
tense marker—and auxiliaries in general—appears be- 
fore the verb in deep structure. It is moved to verb 
suffix position in surface structure by a transforma- 
tion.) 


Braine’s Functional Experiments on Structure: 
Temporal Position as a Controlling Variable


The hierarchical structure so far described can be 
generated by a “binary fractionation” model of phrase 
structure (Braine, 1963b). First, a “sentence” is
divided into two “fractions,” subject and predicate,
and then each of these two fractions is divided into
two fractions, subject becoming determiner plus noun,
predicate becoming verb plus auxiliary. Braine pre-
sents some sketchy evidence suggesting that speakers
may learn syntactic structure of natural languages by
iterative binary fractionations rather than by fraction- 
ations into, say, thirds of a phrase. That is, his experi- 
ments (to be reviewed shortly) suggest that speakers 
find it easier to learn the “absolute” temporal posi-
tions of first vs. last than “relative” temporal positions
such as “middle” or “after the first but before the 
last.” Moreover, Braine suggests that the learning of 
temporal positions in a sentence may be a species of 
auditory perceptual learning, speakers learning which 
verbal responses “sound right” in first position and
which “sound right” in second position. Having
learned which responses sound right in first (subject) 
position and which sound right in last (predicate) 
position, speakers then proceed to learn which re- 
sponses sound right in first vs. last position within the 
subject phrase and which sound right in first vs. last 
position within the predicate phrase. Thus the learn- 
ing of hierarchical structure might unfold a binary 
fractionation at a time. 
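
The binary fractionation scheme just described can be sketched as a short recursive procedure. This Python sketch is an illustration only, under the assumption that each unit either splits into exactly two ordered fractions (first, last) or is terminal; the rule table merely restates the divisions given in the text, and the function names are hypothetical rather than Braine's.

```python
# Each nonterminal unit divides into exactly two ordered fractions: (first, last).
# The table restates the divisions described in the text.
RULES = {
    "sentence": ("subject", "predicate"),
    "subject": ("determiner", "noun"),
    "predicate": ("verb", "auxiliary"),
}


def fractionate(unit):
    """Recursively split a unit into nested (first, last) pairs."""
    if unit not in RULES:
        return unit  # terminal unit: no further fractionation
    first, last = RULES[unit]
    return (fractionate(first), fractionate(last))


def surface_order(tree):
    """Flatten the nested pairs into the left-to-right order of utterance."""
    if isinstance(tree, str):
        return [tree]
    first, last = tree
    return surface_order(first) + surface_order(last)


structure = fractionate("sentence")
order = surface_order(structure)
```

Because every split is binary, a learner following this scheme would need to discriminate only "first" from "last" at each level of the hierarchy, which is the point of the contrast with fractionation into thirds.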

Braine’s (1963a) ingenious experiments involved 
the echoic or textual teaching of miniature artificial 
languages (devoid of “semantic content”) to children
between the ages of 4 and 11 years. In Experiment I 
(A&P language with word constituents), the experi-
menter presented “sentence frames” with either the
anterior (A) or posterior (P) position already filled by a 
nonsense-syllable word from the A-class or the P-class, 
respectively. The experimenter also presented two
choice words, one from the A-class and one from the 
P-class, from which the children had to choose one 
word to fill in the blank position in the sentence 
frame. Correct completions were reinforced with 
poker chips backed up with candy; incorrect comple- 
tions were simply corrected by the experimenter. In 
either case, a learning trial terminated with the sub- 
ject, and then the experimenter, reading aloud the 
correctly completed two-word sentence. In this way, 
the children learned what words constituted the A- 
class (what words were permissible completions of a 
blank in first position) and what words constituted the 
P-class (what words were permissible completions of a 
blank in second position). As a test of generative syntax,
the experimenter then presented sentence frames in 
which a novel A- or P-word occupied first or last posi- 
tion, respectively, and the child had to complete the 
frames by choosing the correct one of two offered 
words, one a familiar A-word and the other a familiar 
P-word. Because the supplied words in the test sen- 
tences were novel, they provided no intraverbal cues 
to correct completions. The only available cues, then, 
were the positions of the blanks in the test sentence 
frames. Subjects 9 and 10 years old completed 78% of 
the test problems correctly, showing that they had 
learned the A- and P-classes on the basis of temporal 
position within the sentence and that temporal posi- 
tion was a sufficient intraverbal stimulus controlling 
productive choice of an A- or P-word. In a replication 
of the experiment adapted for 4-year-old children 
(Experiment V), 75% of the test sentence frames were 
completed correctly, showing that the intraverbal 
learning called for was within the reach of younger 
children. 
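
The logic of the generalization test can be made concrete with a small sketch. It assumes, purely for illustration, that "sounding right" amounts to a tally of how often a word has been heard in each position; the nonsense words are hypothetical and are not Braine's materials.

```python
from collections import defaultdict

def train(sentences):
    """Tally how often each word was heard in first (0) or second (1) position."""
    counts = defaultdict(lambda: [0, 0])
    for first, second in sentences:
        counts[first][0] += 1
        counts[second][1] += 1
    return counts

def complete(counts, blank_position, choices):
    """Choose the offered word that is most familiar in the blank position."""
    return max(choices, key=lambda word: counts[word][blank_position])

# Hypothetical vocabulary (an assumption, not taken from the experiment).
A_WORDS = ["kiv", "juf", "foj"]
P_WORDS = ["bew", "mub", "yag"]
counts = train([(a, p) for a in A_WORDS for p in P_WORDS])

# Test frame: a novel word fills first position, so the blank is in second
# position. The novel word gives no cue; only the blank's position does.
print(complete(counts, 1, ["kiv", "bew"]))  # bew
print(complete(counts, 0, ["kiv", "bew"]))  # kiv
```

Because the supplied word in a test frame is never consulted, the sketch succeeds on position alone, which is the inference drawn from the children's test performance.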

In another experiment (Experiment IV), 9- and 10-
year-old children learned the same artificial language
in the same way, except that before making their final 
completion choices for each sentence frame the chil- 
dren were required to say aloud both sentence com- 
pletions, the incorrect one produced by putting an 
A-word in a blank P-position or a P-word in a blank 
A-position, and the correct one produced by putting 
an A-word in A-position or a P-word in P-position. 
Then the children were permitted to make their own 
completion choices, and they were told whether their 
completions were correct or not, but they were not 
permitted to read their final completions aloud. By 
this procedure Braine attempted to equate auditory 
exposure (actually, echoic practice) with correct and 
incorrect sentences. This experiment did not include 
a test of generative syntax. Nevertheless, the results 
from the learning trials themselves were instructive. 
Whereas all children in the standard Experiments I 
and V learned to complete the training sentences cor- 
rectly within a median of 10.5 or 13 trials, respec- 
tively, and while making a median of only 4 or 4.5 
errors, respectively, some children in Experiment IV 
did not meet the learning criterion within the allotted 
number of trials (50), the median trials to criterion 
was 32, and the median errors to criterion was 9. 
Braine concluded that in Experiments I and V “the 
relevant cue was the temporal position in the spoken 
sentence, and as learning progressed words . . . came 
to sound familiar in the positions in which they re- 
curred. The subjects [of Experiments I and V] were 
then able to respond correctly in generalization prob- 
lems by picking the alternative which made the sen-
tence [literally] ‘sound right’” (1963a, p. 335).

Another experiment (Experiment II: A&P language
with phrase constituents) investigated whether learn- 
ing would be impaired if the response elements con- 
stituting A- and P-classes were sometimes two-word 
phrases instead of single words. With one group of 9- 
and 10-year-old children, a whole (one- or two-word) 
phrase was left blank in the sentence frame, and chil- 
dren completed the frame by choosing between a one- 
or two-word A-phrase and a one- or two-word P- 
phrase. (No cues indicated whether the frame called 
for a one- or two-word completion.) With a second 
group of subjects, only the second word of a two-word 
A-phrase was missing in the sentence frame, or only 
the first word of a two-word P-phrase was missing. 
Thus the completed sentences were two, three, or four 
words long, and the blanks were in first, second, or 
third position in the frame. In this group, then, 
temporal position controlling the formation of A- 
classes and P-classes and cuing correct sentence com-
pletions was not “absolute” first vs. last position (with
phrase constituents), but “relative” position (a P-word
was called for after an A-word or phrase and before 
the last word of the P-phrase, or an A-word was called 
for before a P-word or phrase and after the first word 
of the A-phrase; the missing constituents were words, 
not phrases). Subjects in the first group, learning 
whole-phrase constituents under control of absolute 
first or last position in the sentence, performed as well 
as subjects in the standard Experiments I and V, who 
learned single-word constituents in absolute first or 
last position. However, subjects in the second group, 
learning part-phrase (word) constituents under control 
of relative temporal position in the sentence, showed 
markedly inferior performances. The results of this
experiment led Braine to conclude that absolute first
vs. last position is more easily learned than relative
position, and therefore that the learning of syntactic
structure should proceed more easily by iterative 
binary building up of a sentence into its hierarchical 
structure. 

In the remaining experiment of this series (Experi-
ment III: A&PQ and AB&P languages), Braine (1963a)
investigated whether comparably simple procedures 
would enable 9- to 11-year-old subjects to learn the 
“internal structure” of phrases. Half the subjects 
learned the A&PQ language, and half learned the
AB&P language, its mirror image. Only the A&PQ
language will be described. The A-class in this lan-
guage consisted of one- or two-word phrases, but each
phrase was invariant in composition, so that terms in 
the A-class had no internal structure. The PQ-class 
consisted of two subclasses, the p-class and the q-class. 
Any word from the p-class could be combined with 
any word from the q-class, yielding internal structure 
pᵢqⱼ. A sentence consisted of the following sequence:
(phrase from the A-class) [(word from the p-class) 
(word from the q-class)]. During learning, children 
completed sentence frames by filling in a whole A- 
phrase or a whole PQ-phrase (of varying internal 
structure). Next, a “between-phrase” generalization
test was given. Sentence frames were presented con- 
taining novel A-phrases (with the PQ-position blank) 
or novel PQ-phrases (with the A-position blank), and 
children were offered a choice between a familiar A- 
phrase and a familiar PQ-phrase to complete the 
frame. Again, the novelty of the supplied A- and PQ- 
phrases precluded the children’s completing the be- 
tween-phrase generalization sentences on the basis of 
simple intraverbal phrase cues. Only the position of 
the blank in the sentence frame indicated which 
choice was correct. After the between-phrase general- 
ization test children practiced again completing famil-
iar sentence frames with familiar A- and PQ-phrases.
Then the children were given a “within-phrase” gen-
eralization test. Sentence frames were presented con- 
taining a familiar A-phrase and a novel p- or q-word. 
The frames thus lacked only one word (in p- or q- 
position). Subjects were given three, instead of two, 
familiar alternative words to choose from (“because
there were three parts of speech”—Braine, 1963a, p.
331), a familiar A-word (or phrase), a familiar p-word,
and a familiar q-word.
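
The structure of the A&PQ language described above can be written out as a nested data structure. The following is a minimal sketch under stated assumptions: the vocabulary is invented, and the helper names (`sentences`, `flatten`, `location`) are hypothetical conveniences, not part of Braine's procedure.

```python
from itertools import product

# Hypothetical nonsense vocabulary, invented for illustration.
A_PHRASES = [("gek",), ("om", "fis")]   # invariant phrases: no internal structure
P_WORDS = ["ril", "tov"]                # first position within the PQ-phrase
Q_WORDS = ["wam", "neb"]                # last position within the PQ-phrase

def sentences():
    """Every sentence of the form (A-phrase) [(p-word) (q-word)]."""
    return [(a, (p, q)) for a, p, q in product(A_PHRASES, P_WORDS, Q_WORDS)]

def flatten(sentence):
    """The spoken word string corresponding to a nested sentence."""
    a_phrase, pq_phrase = sentence
    return list(a_phrase) + list(pq_phrase)

def location(sentence, word):
    """Position of the containing phrase in the sentence, then of the word in it."""
    for phrase_position, phrase in zip(("first", "last"), sentence):
        if word in phrase:
            inner = "first" if phrase.index(word) == 0 else "last"
            return phrase_position, inner
    return None

example = sentences()[0]
print(flatten(example))            # ['gek', 'ril', 'wam']
print(location(example, "wam"))    # ('last', 'last')
```

The between-phrase test probes the first element of each location pair, and the within-phrase test probes the second. For A-phrases the second element is reported only for uniformity, since those phrases were invariant.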

All subjects in Experiment III reached the initial 
learning criterion in a number of trials comparable to, 
and with fewer errors than, the subjects in Experi-
ments I, II, and V. In the between-phrase generaliza-
tion test they completed 74% of the novel test frames
correctly, and in the within-phrase generalization test
they completed 68% of the novel test frames correctly,
percentages comparable to those obtained in the gen-
eralization tests of Experiment I. Taken together,
these results showed that children learned concur-
rently both the A- and PQ-phrase structure of the
sentences (the “higher-order” structure corresponding
to subject-predicate) and the internal structure of the
PQ-phrases (the “lower-order” structure corresponding
to the internal structure of a verb phrase, say). Both 
words within phrases and phrases within sentences 
“tend to become associated with the sentence positions 
in which they recur, .. . [and] within fairly wide 
limits, the constitution of the elements in first and 
last position is not an important variable for either 
learning or generalization” (Braine, 1963a, p. 333). 

Braine summed up the results of these five experi-
ments as follows:


(a) “What is learned” are the locations of expres- 
sions in utterances. (b) Units (i.e., expressions 
whose position is learned) can form a hierarchy
in which longer units contain shorter units as
parts, the location that is learned being the loca-
tion of a unit within the next-larger containing
unit, up to the sentence. (c) The learning is a 
case of perceptual learning—a process of becom- 
ing familiar with the sounds of expressions in 
the positions in which they recur. (1963a, p. 337) 


The learning of the temporal positions of words and 
phrases, moreover, appears to be 


a process of auditory differentiation. . . . Per- 
ceptual learning is usually assumed to be a 
rather primitive process and there is therefore 
no reason to suppose that it demands much in 
the way of intellectual capacity in the learner. 
Learning of this sort would therefore satisfy at 
least one requirement of any process postulated
to be involved in first language learning, namely,
that it not require intellectual capacities obvi- 
ously beyond the reach of the 2-year-old. (1963a, 
p. 326) 


“Primitive process” or not, the temporal control of
behavior is familiar territory to operant behaviorists 
(Catania, 1970; Dews, 1970; Jenkins, 1970; Killeen, 
1975; Morse, 1966; Staddon, 1972; Weiss, 1970). What 
is perhaps less familiar is Braine’s important insight 
that temporal position in a sentence constitutes a 
strong intraverbal variable controlling syntactic word 
order. In a later experiment Braine (1965a) showed 
that middle temporal position in a spoken sentence 
string (aXb, pXq) also serves as an intraverbal vari- 
able controlling the formation of a response class (of 
X-words). In this case, however, temporal position was 
not the only controlling intraverbal stimulus, for 
X-words generalized to novel a__b and p__q contexts,
but not (to any considerable extent) to “anomalous”
a__q and p__b contexts. In other words, the intra-
verbally linked classes a__b and p__q combined with
middle position to determine the generative appear-
ance of X-words. This kind of combined control is
what Skinner (1957) termed “multiple causation” and
which is otherwise familiar to behaviorists as condi-
tional discrimination.²
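
Combined control of this kind can be pictured as the narrowing of a response class by successive controlling variables. The sketch below is only an illustration: the tiny repertoire is hypothetical, each variable is reduced to a yes/no test on a word, and the rhyme test is a crude spelling check that works only for this toy list.

```python
# Hypothetical repertoire; a real speaker's repertoire is vastly larger.
REPERTOIRE = {"Marie", "mother", "music", "brother", "other", "Pierre"}

def begins_with_m(word):
    """Stand-in for the instruction 'Give me a word beginning with m.'"""
    return word.lower().startswith("m")

def rhymes_with_brother(word):
    """Crude spelling-based stand-in for rhyme (an assumption of this sketch)."""
    w = word.lower()
    return w.endswith("other") and w != "brother"

def strengthened(repertoire, *variables):
    """Responses whose probability is raised by every variable present."""
    return {w for w in repertoire if all(test(w) for test in variables)}

print(sorted(strengthened(REPERTOIRE, begins_with_m)))
# ['Marie', 'mother', 'music']
print(sorted(strengthened(REPERTOIRE, begins_with_m, rhymes_with_brother)))
# ['mother']
```

A single variable leaves a whole class of candidates; adding a second variable narrows the class, here to one response. Treating control as all-or-none set membership is a simplification, since the functional account speaks of graded probabilities.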

In still other experiments, Braine (1971) pursued 


² I think all of Skinner’s (1957) discussion of the multiple
causation of verbal behavior can be most easily understood in
terms of conditional discriminations. Conditional discrimina-
tions, reflecting the concerted influence of at least two dis-
criminative stimuli, have the general form: Given stimulus A,
response 1 will be reinforced conditionally on the presence of 
stimulus B, but response 2 will be reinforced conditionally on 
the presence of stimulus C. Then, knowing that stimulus A was 
present, one could predict that either response 1 or response 2 
would be emitted, but not which one. Knowing, additionally, 
that stimulus B was present, one could more confidently predict 
response 1, while knowing that stimulus C was present (rather 
than B), one could more confidently predict response 2. For 
example, if stimulus A were “Give me a word beginning with 
m,” then the response “Marie” would be likely in the presence 
of the additional instruction, “which is the first name of a 
scientist who received Nobel Prizes in both physics and 
chemistry”; but the response “mother” would be likely in the 
presence of the additional instruction, “which rhymes with
brother.” 

The point about multiple causation is that any single con- 
trolling variable typically raises the probability of a whole class 
of verbal responses, as “Give me a word beginning with m” by 
itself raises the probability of “Marie,” “mother,” and all other
verbal responses, beginning with m, that happen to be current 
in the speaker’s repertoire. Lacking other information, it would 
be impossible to predict which response from the class a speaker 
would utter. But if one knew all the variables controlling each 
response in a speaker’s verbal repertoire, it would be theoretically 
possible to predict exactly what the speaker would say next on 
any occasion, as “Give me a word beginning with m—and 
rhyming with brother” zeroes in on just a single item.

the intraverbal control of syntactic structure in more
elaborate artificial languages (which were still, it
should be noted, devoid of “semantic content” or, in
other words, contributions from the nonverbal stim- 
ulus control that defines the tact). He instructed adult 
subjects to echo to spoken strings of nonsense words 
having the structures p(Af)(Bg) and (Bg)q(Af)r, where 
p marked a sentence type (analogous, say, to a declara- 
tive sentence marker), and q__r marked a different
sentence type (analogous, say, to an interrogative
sentence marker—“Wh-” words, for example). (Af)
and (Bg) were sentence constituents with internal 
structure, f and g being fixed elements (analogous to 
Braine’s earlier—e.g., 1963b—notion of a “pivot” 
class) and A and B being word classes with 
several members each (analogous to Braine’s 1963b 
notion of an “open” class). In this more elaborate 
language, the internal structure of elements within Af 
and Bg phrases was presumably controlled by 
temporal position within the phrases, but the larger 
structure of constituents within the sentence was pre- 
sumably controlled jointly by temporal position with- 
in the sentence and by the more familiar kind of 
intraverbal stimulus, the “word” elements p and q__r. 
(Note that q__r forms a “discontinuous” intraverbal 
stimulus—Braine, 1965b.) 

An interesting feature of Braine’s (1971) experiment 
with the more elaborate language is that he induced 
one group of subjects to echo not only “well-formed” 
strings but also “anomalous” strings. One type of anom-
alous string consisted of “first-order approximations” to 
the grammatical language (random orders of up to six 
elements, where, over the set of anomalous strings, the 
frequency of each element was proportional to its fre- 
quency in the well-formed corpus). A second type of 
anomalous string consisted of “third-order approxi- 
mations” to the language (strings of 3-11 elements 
consisting of “running triads” from the well-formed
sentences of the language). Together these two types
of anomalous strings made up 7% of all the strings to 
which the experimental subjects were exposed (and 
which they were asked to echo). Control subjects were 
exposed to (and asked to echo) only well-formed 
strings. Recognition tests, sentence-completion tests, 
and “word-association” (“lexical class”) tests (asking
the subject to pick out from a written vocabulary list 
all the words that “went with” f and all words that 
“went with” g) indicated that experimental subjects 
had learned the syntactic structure of the language as 
well as had control subjects, in spite of the diversion- 
ary anomalous strings which experimentals had been 
required to echo. The immunity of the experimental
subjects to deleterious effects of echoing a corpus that
included 7% anomalous strings led Braine to suggest
that the “degenerate” or ill-formed sentences and 
sentence fragments to which children may be exposed 
in the course of learning their natural languages need 
not be thought necessarily to interfere with the 
development of control over children’s syntactic be- 
havior by the regular intraverbal variables in the 
linguistic corpora to which they are exposed (and 
which they probably echo, at least in part and 
covertly). He suggests, quite plausibly, that the intra- 
verbal regularities (both those of temporal position 
and those of intraverbal word linkages) are more than 
enough to override the distraction of “degenerate”
speech samples. 
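
The two kinds of anomalous string can be illustrated with a short sketch. It rests on assumptions: the vocabulary is invented, and "running triads" is read as meaning that every three consecutive elements of a string occur somewhere in a well-formed sentence. Braine's actual construction procedure is not given in the text and may have differed.

```python
import random
from collections import Counter, defaultdict

# Hypothetical nonsense vocabulary; only the frame structure follows the text.
A_WORDS = ["kiv", "zub", "mep"]
B_WORDS = ["tol", "rax", "dun"]

def well_formed_corpus():
    """All strings of the two sentence types p(Af)(Bg) and (Bg)q(Af)r."""
    corpus = []
    for a in A_WORDS:
        for b in B_WORDS:
            corpus.append(["p", a, "f", b, "g"])
            corpus.append([b, "g", "q", a, "f", "r"])
    return corpus

def first_order(corpus, rng, max_len=6):
    """Random order of up to max_len elements, sampled by corpus frequency."""
    freq = Counter(word for s in corpus for word in s)
    words = list(freq)
    weights = [freq[w] for w in words]
    return rng.choices(words, weights=weights, k=rng.randint(1, max_len))

def corpus_triads(corpus):
    """Every run of three consecutive elements found in the corpus."""
    return {tuple(s[i:i + 3]) for s in corpus for i in range(len(s) - 2)}

def third_order(corpus, rng, min_len=3, max_len=11):
    """Chain overlapping triads so that every window of three is attested."""
    followers = defaultdict(list)
    for x, y, z in corpus_triads(corpus):
        followers[(x, y)].append(z)
    target = rng.randint(min_len, max_len)
    out = list(rng.choice(sorted(corpus_triads(corpus))))
    while len(out) < target:
        options = followers.get((out[-2], out[-1]))
        if not options:          # dead end, e.g. after the final marker r
            break
        out.append(rng.choice(sorted(options)))
    return out

rng = random.Random(0)
corpus = well_formed_corpus()
print(" ".join(first_order(corpus, rng)))
print(" ".join(third_order(corpus, rng)))
```

A third-order string preserves local transitions while violating the overall frame, for instance by running on from one sentence type into the other. Some generated strings will happen to be well formed, so a fuller version would filter those out; this sketch also samples distinct triads uniformly rather than by their frequency.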


Intraverbal Contributions to Syntax 


Braine’s (1971) paper is worth close study for the 
further plausible suggestions he makes concerning the 
course of development of intraverbal syntactic control, 
including the control of inflections. For example, he 
notes that some intraverbal regularities may be more 
frequent, more consistent, and simpler, and hence 
may more quickly acquire a controlling role in the 
child’s syntactic behavior than do other intraverbal 
variables. If that is so, the aspects of syntactic behavior 
controlled by these variables should appear earlier in 
children’s speech, resulting in what Whitehurst and 
Vasta (1975) have called “selective imitation.” More-
over, these intraverbal variables might provide a sort 
of “seed” around which more complex intraverbal 
control could “grow,” rather like a crystal (see Robin- 
son, Chapter 21 in this volume). That is, once the 
children were responding reliably to the simpler intra- 
verbal variables, they might begin to come under the 
control of further intraverbal regularities between 
them and more subtle intraverbal stimuli.


Autoclitic Contributions to Syntax


Intraverbal control cannot provide a complete ac- 
count of syntactic structure, however powerful it 
proves to be as an explanatory principle. As Braine 
(1963a) noted, for example, it will not account for 
“contrastive” word order. For example, there is
nothing in the intraverbal variables to account for the
contrastive word orders of “Boy bites dog” vs. “Dog
bites boy.” The nonverbal variables controlling the 
tact are importantly involved here, and it is here that 
the autoclitic hypothesis becomes most useful. By 
autoclitic hypothesis I mean the hypothesis that ex- 
ternal nonverbal and verbal stimuli combine with 
private “stimuli” generated by covert or incipient
verbal responses to determine a public verbal re-
sponse. (Strictly speaking, only the private variables 
are the autoclitic contribution to syntactic verbal be- 
havior, but the full power of the autoclitic concept 
depends on multiple causation—the combined action 
of many determinants at once, only one of which is 
privately generated.) 

Segal (1975b) pointed out the approximate corre- 
spondence between the grammatical account of syntax 
in terms of deep structure vs. surface structure and 
the behavioral account in terms of verbal primitives 
vs. autoclitics. According to transformational-genera- 
tive theory, the terms in deep structure are not un- 
ordered strings but rather are structured in ways that 
reflect basic grammatical relations (subject-predicate 
and so on), basic semantic relations (actor-action and 
so on), and complex interrelations among these (e.g., 
semantic-syntactic constraints on ‘animate’ grammat- 
ical-semantic subjects with ‘inanimate’ grammatical- 
semantic objects, such as those proscribing the string 
“The boy may frighten sincerity”). According to auto-
clitic theory, also (Skinner, 1957, pp. 332-333), there is 
some basic order among the verbal primitives. This 
“primitive order” reflects variously (1) phonological 
processes responsible for meaningful sequences of 
phonemes; (2) intraverbal order (“A ‘train of thought’
in free association follows the order in which verbal
stimuli evoke other verbal responses”—Skinner, 1957,
p. 333); (3) the overall, long-standing probability of
verbal responses in the speaker’s repertoire (perhaps,
but not necessarily, mirroring the relative frequencies
of “words” in the “language”); (4) the momentary
probability of responses reflecting (a) the momentary 
salience of particular environmental variables, (b) the 
temporal order of the environmental variables (what 
Robinson, in his chapter in this volume, calls “iconic” 
structure), and (c) the momentary (contingent) ar-
rangement of environmental variables into specific
combinations and relations (other than sheerly
temporal arrangements).

This first-order structure, supplied by the overall 
organization of the speaker’s verbal repertoire (items
1, 2, and 3, above) and the structure of the environ-
ment itself (“semantic” items 4a, b, and c, above), is
still “agrammatical.” The grammatical structuring
occurs when the primitively structured responses call 
into play autoclitic processes. Skinner would say the 
verbal primitives function as verbal stimuli evoking 
intraverbal (autoclitic) responses. Consider “Boy bites
dog”:


1. A semantic relation among the events in the en- 
vironment (the boy is doing something), function-
ing as a second-order tact variable, in combination
with the covert availability of the verbal primitive 
(tact) “boy,” functioning as one kind of intraverbal 
variable, and position within the utterance, func- 
tioning as another kind of intraverbal variable, 
together determine the ordering of the response 
“boy” in first (subject) position in “deep structure.” 


2. Two additional semantic relations among the
events in the environment (the actor of the action 
is singular, the action is occurring now), function 
as second-order tact variables jointly evoking -s. 


3. The covert availability of two verbal (tact) primi- 
tives bite and -s, functioning as intraverbal vari-
ables, in combination with the semantic relations
just mentioned (the actor is singular, the action is 
ongoing), functioning as tact variables, determine 
the autoclitic ordering of the primitives as bites. 
(This step is separated from step 2 because, one 
must suppose, -s cannot be ordered until it is avail- 
able. Nevertheless, the semantic relations men- 
tioned in step 2 as controlling the availability of -s 
must be mentioned again in step 3 to account for 
the fact that -s gets suffixed to bite and not to boy 
or dog.) 

4. The covert availability of the autoclitically ordered 
“response pair” bites, functioning as an intraverbal 
variable, in combination with the semantic relation 
between the boy and the biting, functioning as a 
second-order tact variable, and with temporal posi- 
tion in the utterance, functioning as another sort of 
intraverbal variable, together determine the order- 
ing of the response pair bites in second (predicate 
verb) position in the utterance. 


5. The semantic relation between the biting and the 
dog, functioning as a second-order tact variable, 
together with the covert availability of the tact 
primitive dog, functioning as an intraverbal vari- 
able, and temporal position within the utterance, 
functioning as another intraverbal variable, to- 
gether determine the ordering of the response dog in 
third (predicate object) position in the utterance. 
The final result is “Boy bites dog.” (I have not 
included determiners in this analysis. For a percep- 
tive discussion of the complex combination of in- 
traverbal and tact variables controlling the English 
articles the and a, see Brown, 1973.)
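
The five steps above can be caricatured in a few lines of code. This is a deliberately crude sketch: it assumes the semantic relations arrive as ready-made labels (`actor`, `action`, `patient`, and two hypothetical flags), which the autoclitic account does not presuppose, and it collapses covert availability and temporal position into simple list order.

```python
def utter(event):
    """Order verbal primitives by semantic relation, then affix -s to the verb."""
    # Step 1: the actor relation places the actor's tact in first position.
    subject = event["actor"]
    # Steps 2-3: a singular actor and an ongoing action jointly evoke -s,
    # which is suffixed to the action primitive rather than to either noun.
    verb = event["action"]
    if event.get("actor_singular") and event.get("ongoing"):
        verb = verb + "s"
    # Steps 4-5: the verb takes second position, the patient third.
    words = [subject, verb, event["patient"]]
    return " ".join(words).capitalize()

scene = {"actor": "boy", "action": "bite", "patient": "dog",
         "actor_singular": True, "ongoing": True}
print(utter(scene))                                      # Boy bites dog
print(utter({**scene, "actor": "dog", "patient": "boy"}))  # Dog bites boy
```

The same three primitives yield contrastive word orders when the nonverbal relations change, which is the point that purely intraverbal control cannot capture.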


The Isomorphism of Autoclitic and 
Transformational Theories 


At what point does deep structure turn into surface 
structure? For the functional behaviorist, the answer 
is simply when the responses are uttered publicly. 
Transformational-generative grammarians cannot ac- 
cept such an answer, however, because for them the
terms in deep structure are not covert verbal re-
sponses. They are dummy terms representing semantic
and syntactic “features.” For cognitive theorists, these
features are simple “ideas” in the modality (whatever 
it may be) of mentalese. The particular collection of 
features that makes up a single dummy term (or “com-
plex symbol” in Chomsky, 1965) represents a first-
order complex “idea,” and the organization of dummy
terms that makes up a single deep structure repre- 
sents, so to say, a higher-order complex “idea.” Func- 
tionally these features can be construed as the inde- 
pendent variables (nonverbal and verbal) that deter- 
mine the tacts and intraverbals of the autoclitic 
analysis, and in the autoclitic analysis these variables 
directly determine verbal responses (although the 
determination is complex). But in grammatical
theory, even if the uttered sentence is a kernel sen- 
tence, certain basic transformations are called for be- 
fore the dummy terms (ideas) in deep structure be- 
come realized as words. The speakers must dip into 
their “lexicon” to find “formatives” that match exactly
the particular collections of semantic and syntactic 
features assigned to the dummy terms in deep struc- 
ture. Then the speakers must dip into their “morpho- 
phonemic rule book” to transform the formatives into 
their realized sound sequences, words. 

Despite the different ways behaviorists and gram- 
marians talk about these matters, it seems that they 
equally recognize the same complex set of variables, 
and equally recognize the necessity for a complex 
analysis of how the variables combine to determine 
utterances. I do not think it matters a great deal 
whether one employs the terms “deep structure” and 
“surface structure,” or the terms “primitive verbal
responses” and “autoclitically modified verbal re-
sponses” to describe an important distinction between
controlling variables in the external environment and 
controlling variables generated by covert processes 
within the speaker. I do not think it matters a great 
deal whether one assigns “abstract semantic and syn- 
tactic features” to “dummy terms in deep structure” 
(which must then be transformed through “lexical 
and morphophonemic rules” into “words”), or
whether one regards the “nonverbal and intraverbal 
variables” as determining the final, uttered string of 
“verbal responses” through one continuous (but com- 
plex) “autoclitic behavioral process” occurring in real 
time. Both the behaviorist’s and the grammarian’s ac- 
counts of utterances are hypothetical (and meta- 
phoric), and yet nothing less than their intricate, but 
isomorphic, hypothetical accounts seems to account 
for the complexities of syntax. 

MacCorquodale (1969) briefly summarized the
theoretical assumptions underlying autoclitic theory.
As he put it, the plausibility of the theory 


depends upon one’s being able to accept the no- 
tion that a speaker can respond discriminatively 
to (1) what he is about to say; . . . (2) why he is 
about to say it; and (3) how strong the [covert 
or incipient] operant is. (p. 840; italics his) 


MacCorquodale went on to argue: 


The discriminations concern complex relations
between speech and its causes, and they are very
rapid. In this respect it is important not to re-
lapse into conceiving of discrimination as a
separate prebehavioral act. Ordering is discrimi-
native behavior, not the result of it, so that the
complex discriminations in autoclitic behavior
need not be allotted prebehavioral time. . . .
The situation that strengthens the tacts the, boy
and runs also contains the relation that deter-
mines the order of their emission as the boy runs.
If I am correct in this, autoclitic behavior is not,
strictly speaking, controlled by other behavior, 
but by other operants. There is a difference. (p. 
840; italics his) 


In the second passage, as I understand it, Mac- 
Corquodale is suggesting that we regard as merely 
metaphoric Skinner’s references to incipient responses 
and stimulation arising from incipient responses. 
These metaphors simply represent the combined effects 
of multiple tact and intraverbal variables controlling 
syntactic behavior. 

“Prebehavioral time” may or may not reflect 
psychological reality, but it is not an issue distinguish-
ing an autoclitic account of syntactic behavior from
any other account of the complex three-term con-
tingencies determining operant behavior. There need
be no more (or less) “prebehavioral time” involved in
the temporal structuring of syntactic behavior than in 
the temporal structuring of, say, the fixed-interval 
scallop. Nevertheless, in both syntax and the fixed- 
interval scallop, the organism’s earlier behavior is said 
by Skinner to function as a controlling variable con- 
tributing to the determination of succeeding behavior. 
If intraverbal variables, and not only tact variables, 
control autoclitics, then, metaphoric or not, incipient 
verbal responses seem to play an important role in the 
theory of the autoclitic. 


An Appraisal of the Autoclitic 


The autoclitic is an orphan nobody wants. Be- 
haviorists have not wanted to claim it because they
(rightly, I think, despite MacCorquodale’s disclaimer)
sensed its cognitive tendencies. Cognitive theorists
have not wanted to claim it because of its behavioristic 
parentage—and for another reason, it must be said: 
they have a richer, more fully developed version of the 
same idea in the theory of transformational-generative 
grammar. The autoclitic was a serious attempt to
grapple with the difficult problems of syntactic struc- 
ture and the evident need to distinguish between 
something akin to deep structure and something akin 
to surface structure, while remaining as close as the 
problem permitted to the concepts and language of 
behaviorism. Verbal behavior is operant behavior, a 
product of the three-term contingent relation among 
behavior and antecedent and consequent environmen- 
tal events. It turns out, though, that a full description, 
functional or cognitive, of syntactic verbal behavior 
requires the postulation of hypothetical processes 
within the organism, mediating between environmen-
tal “input” and response “output.” There is simply
no gainsaying this.

Perhaps it is time for the hypothesis of the auto-
clitic to give way to more sophisticated analyses of
syntactic processes. Nevertheless, its place in the his-
tory of the psychology of language is an honorable
one. Its lasting contribution is the insistence that syn-
tax is the result of a complex blend of variables, at
least some of which (the variables determining tact
and intraverbal responses) have familiar or conceiv-
able dimensions. It may be that current attempts
within linguistics (associated with the term generative
semantics) to incorporate semantic variables within
the determinants of deep structure may profit from a
study of Skinner’s perceptive theorizing in the auto-
clitic framework. “The speaker is the organism which
engages in or executes verbal behavior. He is also a
locus—a place in which a number of variables come
together in a unique confluence to yield an equally
unique achievement” (Skinner, 1957, p. 313).


PARAPHRASE, THE PROBLEMATIC 
LISTENER, AND MENTALESE 


Skinner’s functional account of verbal behavior was 
deficient in its neglect of the listener. Verbal Be- 
havior contains few serious references to the problems 
of verbal comprehension, and what references there 
are are mostly unsatisfactory. To a large extent Skin-
ner shrugged off problems of accounting for how
listeners learn to understand verbal stimuli as ordi- 
nary problems of discrimination, amenable to simple 
operant analysis and not requiring the special treat-
ment he devoted to the behavior of the speaker.
“Much of the behavior of the listener has no resem-
blance to the behavior of the speaker and is not ver-
bal according to our definition. . . . The behavior
of a person as listener is not to be distinguished from
other forms of his behavior” (Skinner, 1957, pp.
33-34).

If the verbal stimuli to which the listener responds 
are extremely simple, if their temporal arrangement 
is “iconic” (Robinson, chapter 21 in this volume), or 
if the listener’s only role is as an assiduous supplier 
of goods and services for the speaker’s pleasure (his or 
her role as reinforcement mediator), this account is un- 
objectionable. Problems arise, however, when the 
listener is called upon to paraphrase, to translate 
freely from one language into another, or to under-
stand “a fairly difficult paper . . . in the field of
scientific and philosophic discourse” (Skinner, 1957,
p. 278). Here the listener becomes, for Skinner, an- 
other speaker. The listener can properly say he or she 
understands a difficult verbal passage, having complex 
syntactic structure, “only when he can emit corre-
sponding behavior such as might occur . . . in re-
sponse to nonverbal or intraverbal stimuli” (1957, pp.
277-278; italics mine). The problem of paraphrase is 
central, then. And Skinner does not handle the prob- 
lem of paraphrase very satisfactorily, it seems to me. 
The listener can paraphrase 


only after he has identified the variables which 
were mainly effective [in evoking the original 
speaker’s utterance]. . . .

It is . . . difficult to say what happens when
[a person paraphrases or] listens to a passage in 
one language and restates it in another. The case 
is often offered as showing the need for some 
such concept as “idea” or “proposition,” since 
something common to two or more languages 
[or, in the case of paraphrase, two or more utter- 
ances in the same language] appears to account 
for their interchangeability. . . . To say that
[the listener] emits behavior which is controlled 
by the variables which he infers to have been 
responsible for [the original speaker’s verbal 
behavior] . . . is . . . elliptical. (1957, pp. 280,
78; italics mine) 


It is indeed elliptical. Skinner wanted to avoid 
positing what Pylyshyn (1973) has called “abstract 
propositional knowledge” and Fodor et al. (1974) have 
called “formulas in mentalese.” This is what “lies be- 
hind” and “motivates” deep structure in cognitive 
grammatical theory. Rather evasively, Skinner sug- 
gests that 
verbal behavior in one language may give rise to
private events within the individual which he 
may then describe in another language. . . . In 
giving the gist of what one has read in a book or 
heard someone else describe, in the same or a 
different language, the speaker is often con- 
cerned with generating behavior having the 
same effect upon himself. 


[The listener-speaker] tries out a [paraphrase], 
comparing the effects of the two versions upon 
himself and changing the [paraphrase] until the 
effects are roughly the same. But this does not 
account for the behavior which he thus com- 
pares. (1957, pp. 198, 78). 


The “behavior which he thus compares” is the 
listener’s reaction to another’s utterance and the 
listener's reaction to the self-generated paraphrase, but 
the nature of this listener reaction is never specified, 
beyond the suggestion that it may consist of “private 
events.” In other words, listeners infer the speakers’ 
“semantic intentions” or “deep structure” or the
“variables controlling the speaker’s utterance” by
consulting their own private responses to the utterance. 
From these, somehow, the listeners construct a para-
phrase having the same private effects on themselves.
To the extent that the listeners succeed in this, they
may be said, “elliptically,” to have inferred the vari-
ables that controlled the speakers’ utterances.

Consider a concrete example. Premack (1970) de- 
scribed a match-to-sample experiment with his chim- 
panzee, Sarah, in which she was required to give a 
“features analysis” of an apple. With the apple present
as sample, she had to choose, between pairs of com- 
parison stimuli, those comparison stimuli that roughly 
matched the visual features of the apple. The pairs of 
comparison stimuli were: “a red plaque vs a green 
one; a square plaque vs a round one; a square plaque 
with a stem-like protuberance vs a plain square one; 
and a square plaque with protuberance vs a plain 
round one” (p. 123). After Sarah had given her fea- 
tures analysis of the apple, she was required to do an- 
other features analysis, but this time with the word 
apple rather than a real apple as sample stimulus. 
The two features analyses were identical (Premack, 
1970, Table 1, p. 124). With the word as sample stim- 
ulus, she chose between pairs of comparison stimuli 
those features that match the visual features of an 
apple, not the visual features of the piece of blue 
plastic that functioned in Sarah’s language as the 
word “apple.” One might say Sarah “paraphrased” 
the word “apple” by indicating some of the physical 
features of an apple. 
The only account I can offer of Sarah’s success at
this task is to suppose that the apple and the word 
apple evoked similar private “listener reactions” in 
Sarah. Perhaps they both evoked a private visual im- 
age of an apple, or perhaps both activated some more 
“abstract propositional” form in which her “knowl-
edge” about apples was “stored.” Many cognitive
theorists (e.g., Fodor et al., 1974; Pylyshyn, 1973) sug- 
gest that central “storage” of “abstract propositional 
knowledge”—perhaps more or less directly in a form 
representing a features analysis—is more plausible 
than to suppose that all such instances of “para- 
phrase” are based on perceptual imagery in some 
covert peripheral modality. 

Experiments such as Premack’s (1970) seem to offer 
strong justification for regarding the terms underlying 
deep structure as having some more abstract status 
than covert verbal responses. Psycholinguistic research
on paraphrase and on verbal comprehension generally
(e.g., Gleitman & Gleitman, 1970; Johnson-Laird,
1974) suggests that listeners understand utterances, as 
they appear in surface structures, by inferring their 
underlying deep structures or the variables underly- 
ing deep structures. These represent, equivalently, ab- 
stract propositional knowledge, semantic intentions, 
private listener responses, or the antecedent variables 
(motivational, verbal, nonverbal) controlling verbal 
behavior. This, in a roundabout way, seems to be 
what Skinner intended in his discussions of para-
phrase, translation, and understanding.


MORE ON THE COMPLEMENTARITY OF 
FUNCTIONAL AND COGNITIVE THEORIES 


We come full circle back to problems of accounting 
for the behavior of the speaker, problems of speech 
production. For both cognitive theorists and Skinner, 
problems of speech production and speech reception 
are inextricably intertwined because, in any but the
most simple comprehension problems, the listener 
must function simultaneously as a speaker in order to 
function at all. The theoretical problem is to identify 
the determinants behind deep structure (behind prim- 
itive verbal behavior), for these determine both what 
a speaker says and how a listener understands a speak- 
er’s utterances. Here are cognitive statements of the 
problem: 


Common sense invites the view that what hap- 
pens in speech production is this: a speaker 
starts with a message he wants to communicate. 
. . . But what sort of thing is a “message”? And
are we not begging the question of how a speaker
chooses to utter a linguistic form if we say that 
the choice is contingent upon an (unexplained) 
previous choice of a message to communicate? 
. . . Nevertheless, it seems to us that there is
much to be said for the old-fashioned view that 
speech expresses thought. . . . 

It seems reasonably clear that there are cases 
in which . . . thinking consists in merely saying 
to oneself bits of natural language which one 
might equally well have said aloud. The re- 
hearsal which often goes on in short-term memory 
tasks . . . is a persuasive example, . . . and it
is not implausible that some of the thinking 
that goes on in problem solving might consist 
in saying to oneself sentences or sentence frag-
ments in one’s language. . . .

But . . . it seems quite clear that underlying 
many mental capacities, there must be computa- 
tional processes which are carried out in codes 
other than natural languages. The computations
underlying problem solving and the integration
of percepts and motor gestures in nonverbal 
organisms must be of this kind. . . .

We are, in effect, commending a view of the 
cognitive organization of organisms which bor- 
rows heavily from the actual organization of 
multipurpose computers. Such devices typically 
perform their computations in an “internal” 
language which may be quite different from the 
languages in which they accept their inputs and 
encode their outputs. (Fodor, Bever, & Garrett,
1974, pp. 374-377)


Questions about the character of mentalese, 
however hopelessly metaphysical they may at 
first appear, are not entirely beyond the reach 
of the combined methodologies of psychology
and linguistics: we can imagine data which would 
bear directly on such questions. . . . (Fodor et 
al., 1974, p. 383) 


These passages illustrate a point made earlier in
this chapter, that cognitive theorists focus on the hy-
pothetical “computing” processes that must be occur-
ring within the organism whenever complex environ-
mental “input” variables lead to complex response
“outputs.” Research on artificial intelligence and com- 
puter simulation of complex “cognitive behavior” 
seems to hold great promise as a means of illuminat- 
ing the character of (or at least delimiting the possi- 
bilities for) such hypothetical processes. (For an ex-
cellent and persuasive argument, see Turner, 1971.)
Behavior theorists interested in language and other 
complex behavior would profit from a greater famili-
arity with this information-theoretic approach.
On the other hand, cognitive theorists would profit
from a greater familiarity with behavioral research 
on the role of complex environmental variables in 
operant behavior. As behaviorists have been amiss in 
ignoring the contributions of information theorists, 
so cognitive theorists have been amiss in minimizing 
the role of environmental factors in complex behav- 
ior. Not all the determinants of cognitive behavior are 
within the organism, out of direct reach of experimen- 
tal analysis. Psycholinguists seem not to be aware of 
the rich literature on stimulus control of operant be- 
havior and the effects of complex reinforcement con- 
tingencies, a literature which would go a long way 
toward explaining the origins of the semantic and 
pragmatic aspects of verbal behavior, and which un- 
doubtedly has relevance as well to the syntax of ver- 
bal behavior. 

Psychology seems to be maturing, at last, into a 
science. The balkanization of psychology into doc- 
trinaire schools, each with its separate language spoken 
only by initiates, is giving way to a unified conception 
of problems, methods, and theories. If this chapter 
persuades a few cognitive theorists and behaviorists to 
venture out from their partisan positions and join 
forces in grappling with the difficult problems of lan- 
guage, it will have succeeded in its aim. 
Aversive control, 415-431 (see also Avoid-
ance, Electric shock, Negative rein-
forcement)
Aversive control: 
and brain biochemistry, 606 
by-products, 415-430 
and generalization gradients, 450, 455- 
456 
human, 417-420
methods, 416-418 
operant baseline, 5, 415-430 
patterns of responding, 418-430 
and peptic ulcers, 597, 604-605 
Aversiveness: 
assessment, 415, 422 
in behavioral contrast, 76-77, 84-86 
and biological constraints, 115 
errors in programmed learning, 467, 
469-470 
forced running, 103 
heat, 165 
interim period, 139-140
and relativity of reward, 102, 111-112 
stimuli with negative outcomes, 314, 
3195-325 
taste qualities, 527 
thermal stimuli, in human, 162-164 
US, and conditioned suppression, 341 
Aversive situation, as drive operation, 368, 
378-380 
Avoidance (see also Aversive control, 
Electric shock, Negative reinforce- 
ment, Shock delay, Shock deletion) 
Avoidance (of shock, by rats, unless other- 
wise specified), 367-410 
Avoidance: 
abortable sequence schedule, 366 
and alcohol ingestion, 599 
and aversive control, 425-427 
and blood pressure conditioning, 608 
by-product of escape, 364-365 
cardiovascular changes in, 599-604 
cat, 193, 605 
and conditioned reinforcement, 319 
and conditioned suppression, 358, 360 
discriminated 
and animal psychophysics, 516 
and EEG activity, 605-606 
and endocrine changes, 599 
drug effects, 555-557 
EEG activity in, 605-606

endocrine changes in, 599-601, 604 

free-operant, 5, 119, 185, 369-371, 380, 
597-606 (see also Avoidance, Sid- 
man; Shock delay, Shock deletion) 

functional relevance and, 113-119 

gastrointestinal changes in, 597, 604-
605

and generalization, 450, 455-456 

heat, by rats, 156 

and infectious disease, 597-598 

long-term effects, 380, 427, 429, 599- 
605 

multiple-variable-interval, 267-271 

operant-Pavlovian interactions, 88

and peptic ulcers, 597, 604 

pigeon, 13, 120, 450-451 

problems with concept, 364, 369-370 

nee HONEY’: 185. 366, 597, 599-601, 


and response strength, 260 

as shock-frequency reduction, 37

squirrel monkeys, 185, 193-196 

temporal pattern of behavior, 495197 

theories, 364-365, 393, 396-398 

thermal change, 165 

variable-interval, 260, 267-271 
Avoidance, Sidman, 119, 185, 360-371, 380, 

426 
and physiological changes, 597-606 


and thermoregulation, 165 


Baboon: 
blood pressure conditioning, 609-610 
escape-avoidance, 601-602 
thermoregulation, 153, 166 
Background stimuli (see also Contextual 
stimulus): 
and stimulus control, 457-458 
Backward conditioning, 55, 61 
Baconian induction, 2, 12, 28 
Bad news, and conditioned reinforcement, 


‘ 


Barbiturates: 
chemistry, 544 
classification, 544-545
and conditioned suppression, 355-356 
and punishment, 558 
reinforcer, 557 
and schedule performances, 188, 191 
stimulus control by, 555 
tolerance, 550 
Barking, in sea lion, 519-520 
Bar pressing (see Lever pressing) 
Basal ganglia, and electrical stimulation,
585 
Base phrase marker, language theory, 
629-630 
Baseline: 
stability, and drug effects, 7, 543, 551- 
O02 
stimulus control of, and drugs, 555 
Basic strings, in language, 629-630 
Basking, in thermoregulation, 162, 164 
Bat, auditory sensitivity, 519, 523 
Batesian mimicry, 484 
Beak movement, as interim activity, 128 
Behavior modification:
and brain stimulation, 580 
and drugs, 565 
Behavioral arrest, and aversive control, 
425 
Behavioral clock, and induced activities, 
140-148 
Behavioral contrast (see also Local con- 
trast, Negative behavioral contrast, 
Positive behavioral contrast, Tran- 
sient contrast): 


and autoshaping, 73-91, 272-274 

by-product of discrimination, 467 

definition, 73 

determinants, 267, 274 

equations, 267, 270 

and errorless learning, 472-474 

and generalization, 442, 449, 454 

Herrnstein’s account, 266-268, 270-275 

and induced aggression, 469 

inhibition and, 75-76, 84-85 

and matching, 267, 29'79-995 

necessary and sufficient conditions, 73, 
76, 78, 83-85 

with negative reinforcement, 382 

and Pavlovian conditioning, 84-85, 88-
89

and peak shift, 449, 454 

quantitative account, 267, 270 

and reinforcement frequency, 76-78, 84,
86 

response-independent 
133, 575 

and response suppression, 76-77, 84

as schedule interaction, 88 

temporal properties, 75-78, 86 

theories, 75-91, 266-275 

Behavioral final common path, 144 

Behavioral homeostasis, and temperature
regulation, 15%

Behavioral pharmacology, 3, 193-155, 164- 
169, 188-198, 340-360, 540-569 (see
also Drugs, Psychopharmacology)

basic variables, 543-551 
behavioral mechanisms, 551-560, 562 
principles of drug action, 543-551 

Benger state, in sequence of activities, 

Behavioral stream, and negative reinforce- 
ment, 409, 410 

Behavioral toxicology, 562-563 

Behavioral Variation, | 

Behaviorism: 

and language, 519-527, 633 640 
Skinner’s radical, l=? 

Benzedrine (see d-amphetamine)

Benzodiazepine, and conditioned sup- 
pression, 356 

Beta adrenergic blockade, and blood pres-
sure, 603

Betta splendens (see Siamese fighting fish)


echedules, 133- 


Bias:


animal psychophysics, and signal de- 
tection theory, 538-533 
concurrent schedules, 258-239. 247-248. 
BE1-S56 
drug selection, 544 
generalization gradient in extinction, 
AZR 
response, in animal psychophysics. 5106. 
518, 591, 584, 594 
Biconditional behavior, 70
Binary fractionation model, language
structure, 643-647
Biochemistry, of brain, 606
Biofeedback, 571, 580
Biological control systems, 160
Biological predispositions, 113, 122
Biological relevance, 127
Biology of association, 70
Biotransformation of drugs, 550
Birds (see also Chick, Pigeon, Quail):
thermoregulation, 159-160
Birth rate, and food availability, 34
Bisection, animal psychophysics, 532
Biting, rats:
in automaintenance, 65
in contrast experiments, 82
Biting attack (see also Aggressive be-
havior, Attack, Schedule-induced
behavior):

apparatus, 416-418 

intensity, 418 

mouse, 416-417 

patterns, 418-430 

punishment schedule, 428 

schedule-induced, 136 

shock-elicited, 183-184, 380 

and shock reduction, 420, 496 

squirrel monkey, 136, 183-184, 417-425, 
428 

Blackout (see also Timeout): 

as brief stimulus, second-order sched-
ules, 301, 304

delay of reinforcement, and matching, 
951-959 

fixed-interval schedule, 207-208 

fixed-ratio schedule, 910_919

intertrial interval, and stimulus control, 

89400 


multiple schedule, and matching, PAS) 
verre and stimulus control, 499-500, 
50 


Blood flow, instrumental conditioning, 
571, 608 
Blood pressure: 
and conditioned suppression, E52, B58 
conditioning, 571 
and escape-avoidance, 601-604
instrumental conditioning, 608-610
measurement and recording, 558, 606
Boa constrictor, thermoregulation, 162 
Body weight:
and changeover responses, 244 
and escape-avoidance, 605 
and matching, 269 
and polydipsia, 132 
“Botanizing,” 2, 12, 80, 91 
Bout duration, induced activities, 141-
142, 146
Bowing response, pigeon, in automain-
tenance, 65
Boyle’s law, 276
Brachium conjunctivum, self-stimulation
of, 590
Brain chemistry, 606 
Brain damage: 
dogs, 20 
humans, 26 
er and thermoregulation, 


monkey, and spiral aftereffect, 620
physiological study, 8, 19 
preoptic area, and thermoregulation,
104 
rats, 16-18 
and self-stimulation, 582, 586-587
stages in recovery from, 8, P7226
Brain stem circuits, 58% 
Brain temperature, and thermoregulation, 
154, 158 
Break and run, on FI, 140 
Break point, on FI, 265 
Brief stimulus: 
and chained schedule, 296-297, 299 
on concurrent chains, 334 
and conditioned reinforcement, 316-318, 


discriminative effects, 300-309 
in extinction, 305-306 
hopper presentation as, 316-317 
physical properties, 304-305 
and second-order schedule, 
315-316 
Brightness: 

discrimination, 

pigeons, 521, 532 

rats, 526-527 


299-309, 
Brightness (cont.)
stimulus control by, pigeon, 490 
stimulus generalization, rat, 493 
Bristol board, 98, 114 
Brook trout, thermoregulation, 158 
Burst duration, in contingent-response 
experiments, 110-112 
Bursting: 
on concurrent VI schedules, 238, 246 
on interval vs. ratio schedules, 223 
in postshock period, 388, 394 
Button pressing, human, and matching, 
234-235 
Buzzer, stimulus control by, 484 
By-products: 
aversive control, 415-430 
discrimination learning, 467, 475 


CA (see Catecholamine) 
Caffeine, drug classification, 545 
Caloric density, 33, 35, 43 
Caloric regulation, 43, 47 
Cannabis (see Marijuana, 
cannabinol) 
Cannula, implanted, 60 
Carbachol, and thermoregulation, 169 
Carbon monoxide, behavioral effects, 563 
Cardiac output, and escape-avoidance, 
602-603 
Cardiorespiratory 
changes, 597 
Cardiovascular responses, operant con- 
ditioning, 606-611 
Cardiovascular system, experimentally in- 
duced changes, 598, 604 
Carnivores: 
caloric regulation, 47 
feeding patterns, 36, 47-49 
Castration, and central reinforcement, 579 
Cat: 
attack behavior, 14-15 
auditory intensity discrimination, 516 
avoidance, 193, 605 
caloric regulation, 43 
carnivorous eating pattern, 49 
color vision, 515 
EEG activity, 605-606 
electrical brain stimulation, 14-15, 589- 
590 
meal patterning, 35-39, 49 
meal size, 34 
patterning of drinking, 36 
psychophysics, 515, 516, 520 
rat-killing behavior, 14-15 
self-stimulation of brain, 589-590 
shock avoidance, 193, 605 
shock-delay procedure, 393-394 
thermoregulation, 153, 166-167 
visual discrimination, 520 
Catch trials, in animal psychophysics, 519 
Catecholamine: 
and avoidance, 600 
and heart rate conditioning, 608 
and self-stimulation of brain, 571, 585- 
588 
Catecholamine pathways: 
inhibition, 587-588 
and self-stimulation of brain, 585-589 
Catecholaminergic neurons, 571, 585-588 
Category symbols, in language, 630 
Catheter, intravenous, drug administra- 
tion by, 555-556 
Caudate nucleus, self-stimulation of, 590 
Causal factors, behavioral sequences, 143— 


Tetrahydro- 


system, transient 


Cebus monkey, contingent-response ex- 
periment, 102 
Ceiling effect: 
in conditioned suppression, 351, 354 


generalization gradient, 452, 482, 502 
Central grey matter, self-stimulation, 590 
Central limit theorem, and _ behavioral 

chaining, 142 
Central motive state:
behavioral state as, 144
and electrical brain stimulation, 582-
589
Central nervous system, behaviorally in- 
duced changes, 605-606 
Central reinforcement, 570-590 (see also 
Electrical brain stimulation) 

acquisition with, 575 

advantages, 580 

vs. conventional reinforcement, 574-581, 

589 

extinction after, 575-576 

measurement, 572-574 

methodology, 572-574 

and motivation, 570, 576, 579-580, 582- 

589

persistence of behavior, 576-578 

and secondary reinforcement, 576 

species and sites used, 589-590 

theories, 581-588 
Central reward pathways, 582 
CER (see Conditioned emotional response, 

Conditioned suppression) 
Cerebral cortex: 

and inhibition, 588 

integrative functions, 641-642 

stimulation of, 570 
Cerebral peduncle, self-stimulation, 590 
Cervidae, 621 
Chain pulling: 

squirrel monkey, 420-421 

and thermoregulation, 164 
Chained schedule, 289-299 

central reinforcement, 573 

and choice, 335 

complex behavior on, 292 

concurrent (see Concurrent 

schedule) 

conditioned reinforcement on, 288-316 

controlling variables, 289-290, 293 

description, 289 

discriminative stimuli, 289, 293-299 

drug effects, 555 

and electrical brain stimulation, 181

interresponse times, 223 

interval schedules, response rate, 291-
292

maintained responding, 291-293 

order of stimuli, 295-296 

ratio schedules, response rate, 292-293 

and relativity of reinforcement, 181 

and tandem schedule, 290, 293-295, 298 

thermal reinforcement, 164 

transition performances, 290-291 
Chaining: 

and external stimuli, 142 

fixed-interval schedules, 221-222 

interim activities, 142 

sequence of behaviors, 141-143 

and temporal discrimination, 141-143 
Chaining delay, and gradient of reinforce- 

ment, 217-218 
Chaining hypothesis, 
221-222 
Changeover delay: 
animal psychophysics, 523 
concurrent schedules, 79, 242-244, 250, 
276 

minimum for matching, 242-244, 249 

and observing responses, 321

response rate during, 244 

role in matching, 242-244, 250, 276 

use, 235 
Changeover key, use, 234, 237 


chain 


schedule control, 
Changeover response:
punishment, 243-244, 253 
rate, 233, 243-244, 249-250, 254, 276 
Chemical structure, drugs, 544 
Chewing, schedule-induced, 128-133 
Chick: 
autoshaping, 59-60, 65, 70, 120-121 
nuzzling response, 65 
omission training, 65, 120-121 
thermoregulation, 153, 159-160 
wavelength generalization, 486 
Chicken: 
meal patterning, 35 
misbehavior, 13 
Chick-peas, reinforcer for pigeons, 259
Chimpanzee: 
language behavior, $40, 650-651 
schedule-induced polydipsia, 129 
second-order schedule, 299, 302 
token reinforcement, 306 
Chinchilla, auditory intensity discrimi- 
nation, 515-516, 519 
Children, relativity of reward, 102 
Chloral hydrate, 545 
Chlordiazepoxide: 
acquisition with, 561 
classification, 545 
and conditioned suppression, 349, 356-
357
as depressant, 555-556 
and punishment, 558 
schedule control, 189, 191, 557 
stimulus control, 555-556 
Chlorpromazine: 
acquisition with, 561 
classification, 544-545 
and clock schedule, 555 
and deprivation, 553 
errorless learning, 553 
escape behavior, 384 
human retardates, 564 
negative reinforcement, 367, 384 
psychiatric use, 540, 565-566 
and punishment, 558 
schedule control, 188-189, 556-557 
shock avoidance, 556-557 
stimulus control, 555 
thermoregulation, 167, 556 
Choice, 233-282, 313-337 (see also Con- 
current schedules, Matching) 
absolute response rate in, 257-263 
additive difference model, 277 
central vs. conventional reinforcer, 577- 
578 
concurrent chains, 329-337 
concurrent schedules, 235, 239, 245-246, 
253-254 
and conditioned reinforcement, 313-337 
discrete trial procedure, 236 
immediate vs. delayed reinforcement, 
2514. 252 
interresponse time, 255-256 
and matching law, 275-276 
measurement, 235, 239 
and negative reinforcement, 366, 374- 
392 
and punishment, 253-254 
qualitatively different reinforcers, 252, 
262 
and required response rate, 334 
Cholinergic blocking agent, 169 
Cholinesterase inhibitors, and 
regulation, 165 
Cholinomimetic agents, 168-169 
Cingulate gyrus, self-stimulation, 589 
Circadian rhythms, and thermoregulation, 
161-162 
Circularity, of weak law of effect, 100 
Classical conditioning, 340-361 (see also 


thermo- 
Pavlovian conditioning, Respondent
conditioning) 
autoshaping, 119-122 
and brain stimulation, 576 
and conditioned suppression, 342, 344, 
351, 358, 360, 517 
descriptions, 9, 53, 61, 
drug effects, 553, 554 
fear, and avoidance, 393-397 
galvanic skin response, 493 
gerbils, 115-116 
human, 20, 493 
and induced behavior, 125-126 
misbehavior of organisms, 3 
necessary and sufficient conditions, 348- 
344 
and operant behavior, 340-361 
pigeons, auditory frequency, 489 
rabbit: 
auditory frequency, 489 
eyelid conditioning, 493-496, 498 
Click rate, stimulus control by, 496 
Clock schedule: 
and chain schedules, 296-297, 299 
drug effects, 555 
and gastric ulcer, 604-605 
CNV (see Contingent negative variation) 
CO (see Changeover) 
Cocaine: 
classification, 545 
and concurrent matching, 250 
reinforcer, 192-193, 250, 557 
self-administration, 548
and self-stimulation of brain, 585 
Cochlear nucleus, EEG activity, 605 
Cockroaches, meal patterning, 35 
COD (see Changeover delay) 
Codeine, reinforcer, 192 
Coding, language, 620, 629-697 
Cognitive dissonance, 9 
Cognitive learning, and verbal behavior, 
636, 649 
Cognitive processes, and central reinforce- 
ment, 582 
Cognitive theories of language, 628-633,
638, 640-642, 651-652 
CO-key concurrent schedule, 234-237 
Cold escape, rats, 261, 263, 282 
Collateral behavior (see also Adjunctive
behavior, Interim activities, Sched-
ule-induced behavior):
and choice on IRT schedule, 255, 266 
and conditioned suppression on DRL, 
347 
and temporal discrimination, 143 
College students: 
lever pressing, 103, 107
relativity of reward, 102 
wheel cranking, 103, 107 
Color circle, animal psychophysics,
531 
Color coding, animal psychophysics, 530 
Color vision: 
cat, 515 
goldfish, 517, 522 
monkey, 518 
Combined cues method, generalization 
gradient, 446-447 
Combined cues test, for inhibitory control, 
470-472 
Comfort activities, as facultative behavior, 
135 
Command neurons, 588 
Common fate, law of, 623 
Comparator device, 638 
Compatibility, instrumental and uncondi- 
tioned response, 118-122 
Competence, linguistic, 628-629 


138, 340, 489 


530- 


Competing responses: 
choice situation, 257, 262 
and conditioned suppression, 351-353, 
358-359, 360 
and delay of reinforcement, 218 
and inhibitory control, 463-464 
shock situation, 260-261, 408-410 
and stimulus generalization, 438-439 
Competition: 
associative, 507 
behavioral states, 144-147 
stimuli, 496, 498, 507-508 
terminal and interim activities, 132-135, 
145-146 
Complex symbol, in language, 648 
Compound stimulus, and stimulus
generalization, 439, 448-444, 459- 
463 
Comprehension, language, 649-651 
Concept learning: 
and animal psychophysics, 525 
and verbal behavior, 634 
Concepts, linguistic, 619 
Conceptual nervous system, 32 
Concurrent chain schedule: 
and conditioned reinforcement, 303, 
315, 327-337 
description, 297, 327 
matching on, 252, 297-298, 329-333
strengths and weaknesses, 327-329
terminological problem, 298
Concurrent generalization test, 435
Concurrent models, altered physiological 
states, 596-606 
Concurrent schedules, 233-282 
Concurrent schedules: 
absolute response rates, 265-966 
behavioral pharmacology, 551 
brief stimulus presentation, 306-308 
central reinforcement, 572-575
conditioned reinforcement, 326-337 
and conditioned suppression, 355 
and contrast, 78-80 
description, 233-234 
electric shock, 374-375, 386, 388-392
and extradimensional training, 501-503 
generalization test, 435 
and inhibitory control, 463-464
interactions, 78-80, 233-956, 265-066 
matching, 78-80, 233-256, 272-276, 329-
333
generality, 248-2956, 275, 276 
and multiple schedule, 272 
and negative reinforcement, 254 
and punishment, 253-254
reinforcement frequency, 095-944, O47, 
Vida os PaBiad ye 
reinforcement immediacy, 993, OF 1-2" 
reinforcement magnitude, 233-234, 248—- 
‘201 
negative reinforcement, 254, 374-375,
386, 388-392
and observing responses, 321
and peak shift, 463-464 
punishment, 253-254 
types, 234-245; 254-255 
Concurrent superstitions, 234-235, 243 
Conditionability, negative reinforcement, 
407-408 
Conditionable response unit, 222-227 
Conditional discrimination, and verbal
behavior, 636, 646 
Conditioned aversive stimulus: 
and concurrent choice, 253 
and negative reinforcement, 
400 
and stimulus generalization, 447 
Conditioned cardiac respondent, 598 
Conditioned confusion, 317 


393-396, 
Conditioned emotional response (see also
Conditioned suppression): 
and conditioned suppression, 353-357, 
and latent inhibition, 487 
and overshadowing, 492 
partial reinforcement effect, 61 
physiological changes, 598-599 
Conditioned enhancement, 88-90 
Conditioned flexion response: 
dog, 483-484 
goats and sheep, 487 
Conditioned inhibition: 
and contrast, 84-85 
and errorless learning, +75 
and latent inhibition, 487 
measurement, 445-447 
Conditioned reflex (see Reflex) 
Conditioned reinforcement, 288-309, 313- 
337 (see also Secondary reinforce- 
ment) 
chained schedules, 
309, 313, 315 
choice and, 313-314, 326-337 
concepts, 160, 288-289, 309, 313-315 
concurrent chain schedules, 326-337 
conjoint schedules, 307-308 
discriminative effects, 289, 309, 314-315 
discriminative stimulus hypothesis, 181 
drug effects, 
hypotheses, 313-315, 325-326, 336-337 
and information, 303, 313-337 
mathematical treatment, 4 
observing responses, 313-315, 318-326, 
336-337 


988-9890, 903-990, 


pairing hypothesis, 313, 315-318, 334, 
336-337 


paradigm experiment, 288 

relation to primary, 289 

schedule control, ae B88_8O9 

second-order schedules, 309-200, 313 
316 

uncertainty reduction, 318-326, 336-337 

and verbal behavior, 633 

Conditioned suppression, 340-363 (see also 
Conditioned emotional response, 
Positive conditioned suppression) 

and anxiety, 355-357, 360, 398 

and autoshaping, 90, 359 

and avoidance behavior, 358, 390 

classical conditioning, 340-361 

concurrent choice, 255 

and contrast, 88 

description, 341-342 

drug effects, 342, 350, 355-357, 360, 554 

Estes-Skinner procedure, 341-344 

gerbils, 

hypotheses, 351-358 

ee hypothesis, 351-333, 3597- 

and latent inhibition, 487 

measurement, 348-351, 360 

monkey, 89, 352, 358-360 

motivational hypotheses, 353-357, 358 
mouse, 517 
and olfactory sensitivity in pigeon, 513 
and operant parameters, 5, 344-348, 351 
as Pavlovian fear conditioning, 315 
physiological changes, 598-599 
punishment hypothesis, 357-358 
schedule effects, 345-351, 360 
shock deletion procedure, 409 
shock intensity, 342, 346 
shock probability, 343-344 
Siamese fighting fish, 116 
animal psychophysics, 516-517, 522, 525 
Conditioning context, and stimulus 
generalization, 444 
Conflict: 
approach-avoidance, 
and gastric ulcer, 597, 604 
and schedule induction, 139-140 
and de-encephalization, 23 
Conflict procedure, and gastric ulcers, 597, 
604 
Conjoint schedule: 
brief stimulus presentation, 306-308 
and conditioned reinforcement, 304 
Conjunctive schedule: 
and delay of reinforcement, 219 
number of responses, 206-211 
Consequence variables, psychopharma- 
cology, 556-559 
Constant probability schedule, and in- 
duced drinking, 135 
Constant probability VI schedule, 215 
Constraints: 
biological, on learning, 3, 7, 91, 101, 
111-123 
biological, on reinforcement, 112-123 
environmental, on feeding, 37, 44 
genetic, on language, 641 
on reinforcer availability, 40-41, 47 
species, 
on autoshaped response, 61 
and central reinforcement, 580 
on generalization gradients, 435-436 
on stimulus control, 483-485 
systems (see Systems constraint) 
Consummatory behavior (see also Drink- 
ing, Eating, Feeding behavior): 
vs. appetitive behavior, 22 
autoshaping, 62, 70 
in contrast experiments, 81-82 
monkey, 58, 65 
pigeon, in autoshaping, 58, 65, 69-70 
as reflexive, 22, 28, 49, 62 
Consummatory force, 144, 146 
Consummatory response: 
relation to operant behavior, 3, 20, 22, 
62-68, 127 
and terminal response, 127 
and theta waves, 119 
Consummatory response theory, and cen- 
tral reinforcement, 581 
Contact comfort, and thermoregulation, 


Context, and conditioned reinforcement, 
315, 330 
Contextual stimulus (see also Situational 
stimulus): 
and stimulus control, 444, 457-458, 488- 
09 
Contiguity: 
and conditioned reinforcement, 314- 
318, 334, 336 
induced drinking and food, 132 
response-reinforcer, 
interval and time schedules, 214, 219 
problems with, 127-128 
on response-dependent schedules, 204, 
207, 228 
in verbal behavior, 637-638 
Contingencies of reinforcement, 3, 98-124, 
175-176 
Contingency: 
and aversive control, 425-430 
categories, 99, 101 
negative, 63, 99 
vs. pairing 
in autoshaping, 54-56 
reinforcement 
in animal psychophysics, 525 
response, and schedule induction, 125- 
128, 130-136, 140-141 
among responses, 98-123 
response-shock deletion, 398-399 
and shock probability, 375-377, 398-399 
shock, and extinction, 379-380 
shock, and frequency, 398-399 
temporal, in classical conditioning, 343 
Contingency models, altered physiological 
states, 606-611 
Contingency strength: 
quantitative measure, 128 
and terminal responses, 127 
Contingency table: 
reinforcement and punishment, 180 
Contingent variation in EEG: 
Negative, 606 
Positive, 606 
Contingent-response (Premack-type) ex- 
periments, 98-123 
Continuity, behavior in time, 177-178 
Continuous choice procedure (see Con- 
current schedule) 
Continuous reinforcement schedule, cen- 
tral reinforcement, 573, 577 
Contrast, behavioral (see Behavioral con- 
trast, Local contrast, Negative 
behavioral contrast, Positive be- 
havioral contrast, Transient con- 
trast) 
Control (see Aversive control, Schedule 
control, Stimulus control) 
Controlling variables, types, on schedules, 
203-204, 228 
Control procedures (see also Truly 
random control, Yoked controls): 
animal psychophysics, 515-532 
behavioral pharmacology, 551-552 
classical conditioning, 55-56, 61, 84, 120, 
343 
concurrent matching experiments, 242- 
243 
contingent-response experiments, 106- 
110, 115-116 
and stimulus generalization, 435-436, 
442, 445 
Control systems, and thermoregulation, 
160-161 
Control theory, and thermoregulation, 
153-154, 160-162 
Coping response, and gastrointestinal 
changes, 604-605 
Copulation, 109, 122 
Correction procedure, animal psycho- 
physics, 524, 536 
Correlation-based law of effect, 111 
Cortical evoked activity, 605 
Cost: 
procurement vs. use, 47 
response, 239, 563 
Counterexperiment, method of synthesis, 
10 
Counting schedule, conditioned suppres- 
sion, 347 
Courtship: 
lock and key sequence, 142 
pigeons 
autoshaping, 59, 119 
as interim activity, 137 
sticklebacks, 121-122 
CR, definition, 340 
CRF (see Continuous reinforcement sched- 
ule) 
Criteria, reinforcement vs. punishment, 
186-188 
Criterion, in animal psychophysics, 521, 
Cross tolerance, drugs, 550 
CS (see also Warning stimulus): 
in autoshaping, 55-61, 69-70, 80 
and conditioned enhancement, 90 
in conditioned suppression, 89-90, 342- 
345, 350, 359 




in contrast, 88 
definition, 340 
fear conditioning, 315 
informativeness, 58, 80 
Cue (see Stimulus) 
Cued escape, 383-387 
Cues, acquired distinctiveness, 487-488 
Cumulation, drug effects, 550-551 
Cumulative record, advantages, 11 
Cumulative recorder, history, 29 
Curare, and autonomic conditioning, 607- 
611 
Curiosity, 31 
Cyclicities, fixed-interval behavior, 208- 
209 


Cylert (see Magnesium pemoline) 


d’, signal detection theory, 532-537 
DA (see Dopamine) 
d-amphetamine (see also Amphetamine): 
classification, 545 
and learning, 561 
and learning disability, 564 
and negative reinforcement, 367 
reinforcer, 192 
and schedule performance, 190-191 
and thermoregulation, 164 
Dark adaptation, pigeons, 521-522 
Darkness, rearing in, 486 
Deafferentation, 146 
Debilitation, and thermoregulation, 155 
Decay, rate of contingent response, 108 
Decerebrate, 8, 11 
Decremental generalization gradient, 
definition, 445 
De-encephalization, 22-23 
Deep structure, of language, 629-633, 648- 
651 
Defecation, and conditioned suppression, 
351 
Defensive behavior, 115-117 
Deficit, sensory, and thermoregulation, 
155-156 
Deficit theory of motivation, 31, 49 
Delayed conditioning, classical, 343 
Delayed escape, 261-262, 281-282 
Delay of reinforcement (see also Rein- 
forcement, immediacy): 
delayed escape, 261 
and habit strength, 217-218 
human, 262 
and matching, 234, 251-252 
and observing responses, 320, 322, 325 
and response strength, 260-262, 266, 281 
and schedule control, 218-222 
theory, 217-220 
Delay of reward procedure, 125 
Delay reduction hypothesis: 
conditioned reinforcement, 313-315, 
324-337 
equations, 329, 332 
quantitative analysis, 329-333 
Demerol (see Meperidine) 
Density: 
negative reinforcement, 369-370, 375, 
399, 402 
reinforcement 
in autoshaping, 61 
and conditioned reinforcement, 314, 
316, 325-327, 329, 332, 336-337 
differentiation schedule, 224 
drug effects, 552 
and induced behavior, 140 
in time, 214-215 
Dependence: 
on drugs, 185, 192-193, 550, 553 
human, 557-558, 563 
Depletion-repletion model: 
problems with, 31-32 
in regulation of eating, 31, 45-46, 49 
Depressant drugs, 544, 555-556, 561-562 
Deprivation: 

autoshaping, 59 

and central reinforcement, 575-576, 579- 

580, 584 
and conditioned suppression, 354 
contingent-response experiments, 102- 
105; 215, 113,122; 161 

effect of reinforcement, 178, 186 

and magnitude of reinforcement, 259 

operation, 30-31 

and polydipsia, 130, 132 

psychopharmacology, 553 

and rate of eating, 175 

and response-produced shock, 182 

and thirst, in induced drinking, 138- 

139 

as unnecessary, 36 

and verbal behavior, 633 
Descending order of stimuli, in animal 

psychophysics, 521 
Desoxyn (see Methamphetamine) 
Development, gape response, in thrushes, 
19 
Development, stages: 

compared to recovery, 17-22 

grasp, in human, 

rat, 18 
Diabetes, chronic, and meal frequency, 35 
Diabetes mellitus, 10 
Diazepam: 

classification, 545 

and punishment, 558 
Diencephalon: 

stimulation, 577 

and thermoregulation, 158 
Diet: 

selection (see Self-selection) 

and thermoregulation, 162 
Deiters' nucleus, 606 
Difference concept, in animals, 518 
Differential probability rules, 102-108, 

111-113, 122, 181 
Differential reinforcement: 

animal psychophysics, 528-529 

definition, 433 

and generalization gradient, 430-441 

implicit, 489-491 

in shaping, 177 

and stimulus control, 432-476, 493-505 
Differentiation schedule (see also DRL 

schedule, Interresponse time sched- 
ule): 

definitions, 203 

and IRT as conditionable unit, 224 
Diffuseness, and stimulus control, 490, 508 
Digestion: 

herbivores, 33-34, 47-48 

and thermoregulation, 162 
Digging: 

gerbil, 115-116 

hamster, 113-118 
Dimensional control: 

and early experience, 485-487 

inhibitory, 453-456 

and negative reinforcement, 391-392, 

398, 400 

and stimulus generalization, 444-447 
Dimension of generalization, 433 
Diphenhydramine, 564 
Direct variables, and schedule control, 

203-206, 228 
Directedness: 

autoshaped response, 60-61, 69, 81 

Pavlovian responses, 61, 81, 138 
Discrete-trial procedure: 

animal psychophysics, 516 

escape as, 367-368 


food and shock, 485 
generalization gradient, 435 
go/no-go method, 516 
and immediacy of reinforcement, 260, 
281 
matching, 236, 244, 246 
observing responses on, 320 
and overshadowing, 492 
shock deletion, 399 
and stimulus control, 495, 505 
threshold tracking, 522 
Discriminability, index of, 534 
Discriminated avoidance (see Avoidance, 
discriminated) 
Discrimination: 
components of second order schedules, 
316, 318 
extinction, and negative reinforcement, 
378-380 
problems with concept, 433 
simultaneous vs. successive, 487 
Discrimination learning: 
errorless (see Errorless discrimination 
learning) 
and escape from S—, 469-470 
free operant, 499-505 
patterns, by pigeons, 528 
sign tracking in, 
Spence’s theory, 447-451, 455, 456, 462, 
476 
and stimulus control, 499-505 
Discrimination training: 
and attentiveness, 500-502, 505 
definition, 433 
and generalization, 439-453 
and inhibition, 432, 439-453 
schedule effects, 503-505 
and stimulus control, 493-505 
types, 439-440 
Discriminative control (see also Stimulus 
control): and conditioned suppres- 
sion, 347-348 
Discriminative stimulus: 
chained schedules, 289, 298-299 
concept, 180 
concurrent chains, 536 
and conditioned reinforcement, 314-315 
drugs as, 192-193, 555-556 
negative reinforcement, 381-396, 400 
in polydipsia, 129, 131, 138 
second-order schedules, 302-309 
in verbal behavior, 633-635 
Discriminative stimulus hypothesis, con- 
ditioned reinforcement, 181 
Disinhibited activities, 139-146 
Disinhibition: 
and conditioned suppression, 347-348 
and induced activities, 139, 143-146 
and motor control, 
by novel stimulus, 116 
and temporal discrimination, 143 
Displacement activities, 09 ; 139 
Distribution: 
drugs, 543, 547-550 
interresponse times, 224-225 
Disulfiram, 563 
Diurnal rhythm: 
drinking 
rats, 36, 44 
eating 
guinea pigs, 35 
rats, 33, 44 
Dog: 
auditory stimulus control, 483-484 
autoshaping, 58 
avoidance, 406, 602-605 
brain damage, 20 
cardiovascular changes, 602-603 
classical conditioning, 61, 138, 483-484, 
489-492 
conditioned flexion response, 483-484 
drugs as discriminative stimuli, 555-556 
gastric motility, 20 
hippocampal activity, 605 
hypothalamic syndrome, 20 
meal patterning, 35 
negative reinforcement 
shock-delay procedure, 390, 392-393 
omission training, 63 
patterning of drinking, 36 
Pavlovian conditioning, 61, 138, 483- 
484, 489 
salivation 
classical conditioning, 138, 340, 489 
instrumental conditioning, 60 
self-stimulation of brain, 589-590 
thermoregulation, 153, 159, 166-167 
Dolphin, self-stimulation of brain, 590 
Dominance, and feeding behavior, 34 
Domino theory, sequence of behaviors, 141 
Dopamine: 
and self-stimulation of brain, 585-586 
and thermoregulation, 168 
Dopamine-β-hydroxylase, and schizo- 
phrenia, 571 
Dopaminergic pathways, 585-588 
Dorsal noradrenergic pathway, 586 
Dose-effect relations, drugs, and schedule 
performance, 189, 192-193, 202 
Dose-response curve, behavioral pharma- 
cology, 541, 545-547 
Dove: 
schedule-induced eating, 139 
schedule-induced polydipsia, 129, 136 
thermoregulation, 153 
DRH schedule (see also Differentiation 
schedule): 
and choice, 334 
negative reinforcement, 391 
Drinking (see also Licking, Polydipsia, 
Schedule-induced behavior): 
adjunctive (see also Drinking, interim; 
Drinking, schedule-induced; Poly- 
dipsia, Schedule-induced behavior) 
and conditioned suppression, 350 
aversive control schedule, 423-425 


contingent-response experiments, 102- 
109, 181 

elicited by brain stimulation, 571, 579, 
581-583, 589 


instrumental, 105-106 
instrumental avoidance, 113 
interim, 128-130, 134-135, 142, 144-147 
patterning 
and FR requirement, 40, 42 
guinea pigs, 42 
rats, 36, 10, 18 
postprandial (see Postprandial drinking) 
reinforcer vs. punisher, 181 
relation to feeding, 35-36, 42-43, 131, 
135, 139 
schedule induced (see also Polydipsia) 
compared to attack, 137 
development, 131-132, 138 
fixed-interval schedules, 135 
hypotheses, 130-132 
interaction with running, 146-148 
as interim activity, 128-130, 134-135, 
142, 144-147 
monkey, 138 
motivation, 132, 138-139 
rat, 126, 129-137, 143 
rate, 129-135, 138 
temporal locus, 133, 135, 140 
theories, 130-132 
and thirst, 138-139 
Drinking (cont.) 
squirrel monkey, and aversive control, 
423-425 
Drinking tube, 98, 102, 104, 107, 108, 118, 
181, 417 
Drinkometer, 181, 417 
Drive: 
emotional, conditioned suppression, 
354-355, 358 
level: 
and immediacy of reinforcement, 260, 
281 
and magnitude of reinforcement, 259, 
278 
and matching, 262 
Drive induction: 
and central reinforcement, 581 
theory, 99 
Drive operation: 
central reinforcement, 576, 581, 586 
negative reinforcement, 368, 378-380, 
387-388 
Drive reduction: 
and electrical brain stimulation, 571, 
581-582, 584, 586, 588-589 
theory, 99 
and verbal behavior, 633 
Drive state: 
and central reinforcement, 576, 579-584 
and schedule control, 188 
DRL schedule (see also Differentiation 
schedule, Interresponse time sched- 
ule): 
and verses feeding pattern, 48 
central reinforcement, 578-579 
component of chain, 293 
component of conjoint, 307 
and conditioned enhancement, 90 
conditioned suppression, 89, 347, 359 
and contrast, 73, 77 
drug effects, 552 
and generalization gradient, 436, 439 
induced attack, 137 
induced behavior, 134-135, 141-143 
induced drinking, 135 
negative reinforcement, 391 
in omission training, 66 
problems with terminology, 203 
DRO schedule (see also Differentiation 
schedule): 
and behavioral contrast, 75, 77, 267 
component of second order, 306 
Drug abuse: 
animal models, 192-193 
and behavioral pharmacology, 542, 557- 
558, 563 
Drug effects, micro-analysis, 562 
Drugs (see also Behavioral pharmacology, 
Psychopharmacology, specific drug): 
absorption and distribution, 543, 547- 
550 
and aggression, 417, 552, 562 
behavioral mechanisms, 551-560 
behavioral pharmacology, 5, 188-193, 
349-360, 540-569 
and blood pressure, 603 
classification, 544-545 
clinical use, 188, 540, 542, 563-566 
and conditioned suppression, 349-350, 
355-357, 360 
cumulative effects, 550-551 
dose-response curves, 189, 192-193, 202, 
541, 545-547 
and electric shock, 189-191, 552, 555- 
560 
and electrical brain stimulation, 188, 
571, 574, 585-586 
fate in body, 550 
and motivation, 188-189, 559-562 


and negative reinforcement, 367, 384 

principles of action, 543-551 

and psychotic behavior, 564-566 

psychotropic, 545 

reinforcement vs. punishment, 178, 188- 
193 

reinforcers, 192-193, 556-558 

and relativity of reward, 179-181, 188- 
193 

reversibility of effects, 551 

and schedule control, 188-193, 202, 542- 
543, 546, 551-552, 555-560 

self-stimulation procedure, 571, 574, 
585-587 

sensation and perception, 562 

solubility, 549-550 

specificity of action, 551-552 

and thermoregulation, 153-155, 164- 
169, 188-189, 556 

time course of effects, 548-549 

toxic effects, 546, 562-563 

type classifications, 544-545 


Dry mouth: 


and grooming in hamster, 144 
theory of polydipsia, 129 


Duckling, stimulus generalization, 485-486 
Duration: 


bout, of induced activity, 141-142, 146 
brief stimulus, second-order schedules, 
304-305 
burst, contingent-response experiments, 
110-112 
component, in multiple schedule 
and contrast, 79, 87, 90 
and matching, 269-272 
and negative reinforcement, 382-383 
CS, and conditioned suppression, 89 
as discriminative stimulus, 218-219, 
534-535 
interim activities, control by food, 131 
intertrial interval, autoshaping, 57-58, 
66, 78 
interval component, chained schedules, 
291-292 
of peck 
autoshaping, 24, 58, 67-68, 81 
and contrast, 87-88 
on fixed-interval, 67 
on fixed-ratio, 67 
food vs. water, 58-59, 67, 81, 119 
postreinforcement pause 
on fixed-interval, 213, 216 
on fixed-ratio, 209, 213, 217, 226-228 
prefood stimulus, and positive condi- 
tioned suppression, 359 
preshock stimulus, and conditioned sup- 
pression, 344-345, 350 
of ratio run, as conditionable unit, 227 
reinforcement (see also Reinforcement, 
magnitude) 
and absolute response rate, 259 
and conditioned enhancement, 90 
discrimination of, 249 
responding, contingent-response experi- 
ments, 108-112 
of S—, and inhibition, 447 
stimulus, and conditioned suppression, 
344-345, 350, 359 
terminal links, concurrent chains, 327- 
337 
trials, autoshaping, 57-58, 66, 78, 90 


Dynamic effects, and schedule perfor- 
mance, 204-210, 228 
Dynamic models, stimulus control, 451- 
453 


Eating (see also Feeding behavior): 


curve of, in rat, 174-175 
development, in rat, 17-18 




electrically-evoked, 15 
lateral hypothalamus, 16-18 
patterns, and evolution, 33, 46 
periodicity, 28-30 
rate, and FR requirement, 41-42 
recovery, after lesions, 17, 20 
as reflex, 29-31 
relation to drinking, 35-36, 42-43, 131, 
135, 139 
schedule-induced, in doves, 139 
and schedule-induced polydipsia, 129- 
130, 135 
Eatometer, 29 
Echoic, in verbal behavior, 633-634, 637, 
639, 644 
Echolocation, in bat, 519 
Ecological niche, and feeding patterns, 33, 
43, 46-49 
Economics, of feeding behavior, 34, 46, 48 
Ectostriatal brain stimulation, and choice, 
252 
Ectostriatum, self-stimulation of, 590 
Ectotherm, thermoregulation in, 156-161, 
166 
EEG (see Electroencephalogram) 
Effort, of response, and thermoregulation, 
4 


Egg rolling, in goose, 23 
o, 9 
EKG (see Electrocardiogram) 
Elavil (see Amitriptyline) 
Electrical brain stimulation (see also 
Central reinforcement): 
and animal psychophysics, 518 
autoshaping with, 60 
and behavior, 570-590 
cat, 14-15, 589-590 
choice, 252 
concurrent schedules, 237 
conditioned suppression, 358-359 
diencephalon, 577 
and drugs, 188, 571, 574, 585-586 
elicitor and reward, 14 
gerbil, 111-112 
hypothalamus, 14-15, 60, 111, 164, 178, 
181, 259, 570-589 
interpeduncular nucleus, 585 
locus coeruleus, 585 
medial forebrain bundle, 577 
methodology, 572-574 
and motivation, 14, 570-589 
partial reinforcement, 578-579 
pigeons, 252 
postponement, 178 
rat, 14, 60, 89, 120, 164-165, 178, 180, 
237, 259, 279, 518, 572-590 
reinforcement magnitude, 179, 259, 263, 
279 
reinforcer vs. punisher, 178-181 
and relativity of reward, 102, 181 
reward in visceral conditioning, 607-611 
septum, 570, 575-576, 583 
species and sites used, 589-590 
substantia nigra, 585 
telencephalon, 577 
and thermoregulation, 164-165 
Electric shock (see also Avoidance, Escape, 
Negative reinforcement, Shock de- 
lay, Shock deletion): 
in animal psychophysics, 516-517, 523 
and attack behavior, 183-184, 380, 409, 
417-425 
autoshaping with, 65, 70 
in aversive control, 415-430 
avoidance (see Avoidance, Negative re- 
inforcement, Shock delay) 
and central reinforcement, 574 
changeover response, 243-244, 253 
concurrent matching, 253-254 




and conditioned reinforcement, 304-305 

and conditioned suppression, 341-356 

and contrast, 74-77 

delay of escape, 261-262, 282 

density-frequency continuum, 369-370, 

375, 383 

density reduction (see Shock density re- 
duction) 

as discriminative stimulus, and negative 


reinforcement, 368, 369, 377-380, 
383 

and drug effects, 188-191, 555-556, 558- 
559 


escape (see Escape) 
gerbils, 115 
intensity: 
and choice, 253 
and conditioned suppression, 342, 346, 
354 
and number of responses, 194 
intensity reduction, negative reinforce- 
ment, 375, 380 
intermittent schedules, 375-377 
and latent inhibition, 487 
and negative induction, 74 
negative reinforcement, 364-410 


postponement, 185, 194-196 (see also 
Shock delay) 

reinforcer vs. punisher, 178-180, 183- 
184, 193-197 

response-produced, 176-185, 190, 193- 
196, 580 
Siamese fighting fish, 116 
squirrel monkeys, 173-179, 183-185, 189- 
190, 193-196, 415-430 
Electrocardiogram, operant conditioning 
of, 608 
Electrode implantation, methods, 572 
Electroencephalogram, and avoidance, 
605-606 
Electromyogram, 417-418, 606-607 
Electrophysiological changes, behaviorally 
induced, 605-606 
EMG (see Electromyogram) 
Emitted behavior, vs. elicited, 174-175 
Emotion (see also Motivation, Condi- 
tioned emotional response): 
and conditioned suppression, 353-358 
Encephalization, 20-23 
Encoding, in natural language, 620, 622- 
627 
Encoding specificity, and memory, 638 
Endocrinological system, durable changes, 
597-604 
Endotherm, thermoregulation, 150 
Energy Sas regulation, 31, 33, 43, 47, 
12 


Enhancement: 
conditioned (see Conditioned enhance- 
ment) 
responding, by shock, 185 
stimulus control, 500-509 
Environment: 
and feeding patterns, 33, 36, 44-50 
simplification, 11 


Environmental constraints (see Con- 
straints) 

Environmentalism, and language, 641 

Epinephrine: 


and heart rate conditioning, 608 
and stimulus control, 555-556 
Episodic memory, 637 
Equal-brightness contour, rat, 526-527, 
532 
Equal-loudness functions, animal psycho- 
physics, 526, 532 
Equalizing, vs. matching in concurrent 
schedules, 272 
Equanil (see Meprobamate) 


Equipotentiality, premise of, 91 
Error: 
aversiveness: 
in programmed learning, 467, 469-470 
in discrimination learning, definition, 
464-465 
in operant behavior, 22, 23 
Errorless discrimination learning, 5, 76, 
86, 464-476 
Errorless discrimination learning: 
and aggression, 468-469 
and behavioral contrast, 76, 86, 472- 


definition, 464-465 
drug effects, 553 
inhibitory stimulus control, 470-474 
and stimulus control, 186, 464-476 
reconsideration, 432, 464-476 
resistance to reinforcement, 446 
techniques, 466 
Terrace’s theory, 467-468 
as transfer of control, 465-466, 475 
Error signal, in thermoregulation, 160- 


Escape (see also Avoidance, Negative re- 
inforcement) 
(from electric shock, by rats, unless 
otherwise stated): 
added cues, 383-387 
and aversive control, 99, 416, 421, 425- 


centrifugal force, 367 
as change of situation, 387 
cold water, by rats, 261, 282 
and conditioned reinforcement, 319 
cued, 383-387 
delayed, 261-262, 281-282 
drug effects, 557 
free operant, 368 
heat, 165, 166, 168 
intense light, 367 
noise, 282, 367 
pigeon, 
procedure, 367-368 
and response strength, 261, 281-282, 
543 
rotation, 367 
runway, 261-262, 281-282 
squirrel monkeys, 189, 380, 557 
stability of, 543 
temperature change, 168, 367 
temporal pattern of behavior, 425-427 
Escape vs. avoidance distinction, problems 
with, 368-370, 387 
Estrus, rat, and thermoregulation, 162 
Ethanol (see also Alcohol): 
reinforcer, Tats, 000, 061 
and stimulus control, 555-556 
Ethology, 4, 7, 11-12, 15, 19-20, 49, 91, 
156, B71-B78 
Ethyl alcohol (see Ethanol, Alcohol) 
Evolution, feeding patterns, 33, 40-50 
Excitation: 
and dimensional control, 444-445 
by reinforcement, 264, 272-274 
Excitatory stimulus, definition, 444 
Excretion, drugs, 550 
Expectancy, electrical signals of, 606 
Expectation: 
as behavioral state, 144 
in Pavlovian conditioning, 138 
Experimental analysis of behavior, 1-2 
“Experiments of fruit,” 12 
“Experiments of light,” 12 
Exploratory behavior: 
and motivation, 584 
rats, 60, 120 
Extended chained schedule, 291 
Extension, verbal, 634 
Extinction: 
brief stimulus during, 305-306 
central reinforcement, 575-576, 589 
and conditioned reinforcement, 288- 
289, 299 
and contrast, 71-77, 80, 82-83 
drug effects, 560-562 
after fixed-ratio, 223-226 
generalization gradient in, 435, 438, 597 
heart rate conditioning, 607 
massed, and peak shift, 456 
a and stimulus control, 456-457, 
52 
multiple schedule, 71-77, 80-83 
after negative reinforcement, 377-380 
non-passive effects, DDG 
observing response procedures, 318-326 
relation to reinforcement, 9305 
resistance to (see Resistance to extinc- 
tion) 
after shock schedules, 377-380 
Extinction-induced aggression, 462-469 
Extinction methods, generalization gra- 
dient, 434 
Extinction schedule, relation to other 
schedules, 203 
Extradimensional shift, 487 
Extradimensional training: 
analysis, 503-505 
concurrent experiments, 501-503 
and stimulus control, 497-505 
successive stage experiments, 497-500 
ee cues, in animal psychophysics, 
Extrapyramidal system, and central rein- 
forcement, 572 
Eye blink, Pavlovian conditioning, 61 
Eyelid conditioning, rabbit, 495, 496, 498 
Face washing, hamster, 113-115, 118 
Facilitation: 

by aversive stimuli, 422, 424, 428 

induced behavior, 141, 144 
Facultative behavior: 

definition, 126 

examples, 1535, 140 

grooming, 133 

running, 133-137, 140 
pe oo on periodic schedules, 
Fading procedure: 

in animal psychophysics, 220 

and errorless learning, 466 

in language learning, 640 
HAC 1s trace, negative reinforcement, 468 
False alarms, animal psychophysics, 516, 

£31, 528, 538-554 
False positive responses, animal psycho- 
physics, 516 

Pasting, SPoRAtion £5 iHsUES eating, OY 33 
Fat solubility, drugs, 549-550 
Fate, drugs in body, 550 
Fatigue: 

and rate of eating, 175 

and schedule control, 23 
Fear: 

as incentive, 100 

Pavlovian conditioning, 315, 393-397 
“Feast or famine,” in large carnivores, 47 
Feature, stimulus control by, 528 
Features analysis, and language, 650-651 
Feedback (see also Positive feedback, 

Negative feedback): 
autonomic conditioning, 610 
in avoidance, and gastric ulcers, 604- 


605 
between behavior and consequences, 
125, 144 


behaviors in sequence, 145-148 




Feedback (cont.) 
in language learning, 624 
and schedule-induced behavior, 135, 
145-146 
Feedback theory of reinforcement, 4 
Feeding behavior (see also Eating, Meal): 
cat, 34-35, 38-39, 49 
and dominance, 34 
elicited by brain stimulation, 571, 579, 
581, 583, 589 
environmental constraints, 44, 49 
and FR requirement, 42 
pecking as, in pigeon, 69 
relation to drinking, 35-36, 42-43, 131, 
135, 139 
species differences, 47, 49, 58 
topography, in monkeys, 59 
Feeding pyramid, 47 
Fever, 160-166 
Figure-ground problem, in escape para- 
digm, 383 
Final common path, behavioral, 144 
Fibonacci VI schedule, 215 
First-order deviations, fixed-interval 
schedules, 208 
Fish (see also Goldfish, Siamese fighting 
fish, Stickleback): 
autoshaping, 58 
failure of temporal discrimination, 140 
thermoregulation, 156-158 
5HT (see Serotonin) 
Fixation, in animal psychophysics, 523- 
524 
Fixed action pattern, 15, 17, 23 
Fixed constant number schedule, 226 
Fixed-cycle procedure, shock deletion, 
371-373, 375, 400-402, 404-405 
Fixed-interval schedule: 
absolute response rate, 258, 265 
adventitious reinforcement and punish- 
ment, 185 
avoidance, 367, 377, 384, 388 
central reinforcement, 578 
in chained schedule, 289-298, 335 
chaining on, 222 
compared to fixed-ratio, 206 
compared to variable-interval, 265 
concurrent matching, 252 
conditioned suppression, 341, 346-347 
and contrast, 73 
cyclicities in responding, 208-209 
definition, 202 
delay of reinforcement, 218-219 
drug effects, 188, 192-193, 544, 555, 559, 
560 
dynamic effects, 205-210 
electric shock, 178-179, 182-184, 193- 
196 
escape, 380 
and generalization gradient, 452-453, 
458 
Herrnstein’s equation, 265 
induced behavior, 125, 129-130, 133, 
135-136, 138-142 
local rate of reinforcement, 216 
matching, 242, 247, 252, 254, 255, 266 
negative reinforcement, 367, 384, 388 
observing responses, 318, 320, 322, 325 
peck duration, 67 
polydipsia, 129-130, 138 
punishment, 429 
reinforcement omission, 301 
response-initiated, 301 
response-produced shock, 178-179, 182- 
184, 193-196 
responses per reinforcer, 205-212 
response strength, 265 
in second-order schedule, 300-305 
species differences, 12 
temporal patterning, 213-214, 216 
two-state analysis, 265 
Fixed-ratio schedule: 
acquisition of performance, 227 
alcohol consumption, 563 
central reinforcement, 578 
compared to fixed-interval, 206 
component of chain, 292-295 
component of conjoint, 307 
component of second-order, 299-305 
and conditioned suppression, 346, 348- 
349 
in contingent-response experiments, 104 
and contrast, 74 
definition, 202 
drug effects, 188, 192, 543-544, 546, 548- 
549, 551, 556, 559, 562-564 
duration requirement, 227 
dynamic effects, 206, 209-210 
escape-avoidance, 601-602 
induced behavior, 136, 139 
interreinforcer time, 205, 209-210, 217 
limits on size, 38-39, 45, 206, 213 
matching, 246-247, 252, 254, 266 
and meal parameters, 44-45 
and meal patterns, 36-39 
meal reinforcement, 38-39, 45 
negative reinforcement, 367, 385-386, 
390 
observing responses, 320 
and pattern of drinking, 40 
and pattern of running, 40 
pause duration, 209, 213 
peck duration, 67 
response number, 209 
response-produced shock, 182, 194-195 
responses per reinforcer, 209-210, 212 
response units, 225-227 
and schedule-induced behavior, 136, 139 
and shaping history, 177 
stereotypy of responding, 227 
strained performance, 37, 543 
temporal patterning, 213 
and thermoregulation, 164 
Fixed-time schedule: 
adventitious punishment, 184-185 
adventitious reinforcement, 184-185, 
204 
component of chain, 335 
and concurrent matching, 252 
definition, 202 
excitation of pecking, 272-273 
and induced behavior, 126, 128, 130, 133, 
139-142 
shock avoidance, 185 
shock-elicited behavior, 183 
temporal patterning, 140-141, 214 
Flavor (see Taste) 
Flexion response, conditioned 
dog, 483-484 
goats and sheep, 487 
Flicker-fusion frequency, monkeys, 522 
Floor effect: 
on generalization gradient, 471-472, 482 
in generalization of extinction, 445 
Fluphenazine, 545 
Food: 
as discriminative stimulus: 
on fixed schedules, 129, 131, 133 
on variable schedules, 133 
free: 
and contrast, 87-90 
during interim period, 128, 138 
interaction with water, and matching,
252
reinforcement, and stimulus control,
484-485
reinforcer vs. punisher, 178, 197
type and palatability, and induced
drinking, 132-133
as US in autoshaping, 58 
Food anticipation: 
interaction with running and drinking, 
146-148 
as terminal response, 127, 132-134, 141-
143, 146-147
Food economy, and ecological niche, 46-
49
Food expectancy, as induced state, 138 
Food-gathering, gerbils, 115, 118 
Food intake (see also Eating, Feeding): 
and water intake, 35 
and thermoregulation, 162 
Food rate (see also Rate, reinforcement): 
and schedule-induced behavior, 129-138 
Forced-choice procedure, animal psycho- 
physics, 517-518, 524, 537 
Forced contingent response, 102-103, 111, 
181 
Forced motor activity, and central rein- 
forcement, 573 
Forgetting: 
and stimulus control, 457-458 
and thermoregulation, 155 
Formal classes, verbal responses, 637-638 
Formal response unit, 222-223
Formatives, in language theory, 629, 630, 
648 
Fourth-order deviations, fixed-interval
schedules, 208
Free feeding, patterns, 28-36
Free food schedule (see Fixed time sched-
ule, Variable time schedule)
Free operant procedure:
avoidance, 5, 119, 185, 369-371, 380,
597-606
discrimination, 499-505
Freezing:
and animal psychophysics, 517
and conditioned suppression, 351, 353
rat, 116-117, 407-409
Frequency, of tone, generalization, 440-
441, 450, 455-456
Frequency of reinforcement (see Rein-
forcement, frequency)
Frog: 
food response in, 121 
psychophysics, 515 
thermoregulation, 157-158 
Frontal cortex, self-stimulation, 590 
Frustration, 23, 85, 468, 509 
Functional relevance hypothesis, 113-118,
122, 125
Functional theory, language, 628-629, 633- 
640, 651-652 
Functionalism, and language, 640-642 


Galvanic skin response, 61, 493 
Gamma rays, irradiation of rats, 527 
Ganglionic blockade, 611 
Gape response, development, in thrushes, 
19-20 
Gasterosteus aculeatus (see Stickleback) 
Gastric motility, dog, 20 
Gastric ulcer, rat, 597, 604-605 
Gastrocnemius muscle, 607 
Gastrointestinal changes: 
behaviorally induced, 604-605, 608 
human, and avoidance, 597 
Gaussian distribution, and generalization 
gradients, 449 
General activity: 
and estrus in rat, 162 
and induced behavior, 134 
and thermoregulation, 161-162, 165-168 
Generalization (see also Stimulus generali-
zation):
and animal psychophysics, 526-534
of extinction, 445 
language behavior, 619, 621 
maintained (see Maintained generaliza- 
tion) 
mediated, 509 
problems with concept, 433 
semantic, 634 
Generalization gradient (see also Stimulus 
generalization gradient): 
animal psychophysics, 526-534 
and chained schedule, 290-291 
and conditioned suppression, 351 
inhibitory, and contrast, 82, 85 
LSD effects, in rats, 554-555 
maintained, 529-530 
postdiscrimination, 451, 493-494 
and stimulus control, 482 
Generalization methods, animal psycho- 
physics, 527-530 
Generative grammar, theory, 628-633,
647-649
Generative property, natural language,
621-622, 636-637, 639-640, 644-647
Generative semantics, 649
Generic extension, in verbal behavior, 634
Generic nature, stimulus and response, 30
Genetics:
and language, 640-641
species differences in avoidance, 407-408
Geometric VI schedule, 215
Gerbil:
alert posturing, 115-116
classical conditioning, 115-116
digging, 115-116
drinking, 98, 118
eating, 98, 114
electrical brain stimulation, 111-112
meal patterning in, 35
paper shredding, 98, 110-111, 114, 118-
119
punishment, 115
running, 98, 114, 118
self-stimulation of brain, 589
Gestalt:
law of common fate, 623
in verbal behavior, 637
Gill extension, Siamese fighting fish, 116
Glandular response, instrumental condi-
tioning, 609-610
Globus pallidus, self-stimulation, 590
Glucagon, and electrical brain stimula-
tion, 579
Glucose:
concentration, and response strength,
259-260, 263, 279-280
reference, and insulin injection, 557
“Glue,” reinforcement as, 99, 101
Gnawing, elicited by brain stimulation,
581
Goal-directed behavior, disruption by le-
sions, 586-588
Goat: 
latent inhibition, 487 
self-stimulation of brain, 589 
Golden hamster (see Hamster) 
Goldfish: 
color vision, 517, 522 
peak shift, 456 
self-stimulation of brain, 590 
shape perception, 529 
thermoregulation, 153, 158 
Go/no-go method, animal psychophysics, 
515-516, 524 
Good news, and conditioned reinforce- 
ment, 319, 336 
Goose, greylag, egg retrieval, 23 
Gonadal hormones (see Sex hormones)
Gradient:
generalization (see Generalization gra-
dient, Stimulus generalization gra-
dient)
reinforcement, 217-219 
Grammar, theory, 628-633, 647-649 
Grammatical items, language theory, 630 
Grasp response, human, development and 
recovery, 19-20 
Grimacing, human retardate, 564 
Grooming: 
on concurrent schedules, 246 
as facultative behavior, 135, 140 
as interim activity, 17, 137 
normal, description, 17 
Growth: 
and meal patterning, 38 
and meal size, 35 
GSR (see Galvanic skin response) 
Guinea pig: 
caloric regulation, 43 
eating behavior, 34-35, 38, 41, 44 
growth, 38, 41 
as herbivore, 48-49 
patterning of drinking, 36 
sexual behavior, 21 
stimulus generalization, 441 
Gull chick, generalization gradients, 435-
436
Gustatory pathways, 586 


Habenular nucleus, 586 
Habit, 101 
Habituation: 
to aversive stimulation, 422 
to drugs, 550 
and rate of eating, 175 
Haldol (see Haloperidol) 
Hallucinogens: 
classification, 545
and perception, 562 
as reinforcers, 557 
and stimulus control, 555-556 
Haloperidol: 
classification, 544-545
and self-stimulation of brain, 586 
Hamster: 
contingent-response experiment, 113-115
digging, 113, 118
eating, 113-115
face washing, 113-115, 118
grooming, 113
rearing responses, 113, 118
scrabbling, 113-114, 118
sexual behavior, 113
thermoregulation, 159
Handedness, 606
“Hard wiring,” response to stimulus, 121
Head bobbing, schedule-induced, 12% 
Heart rate: 
and conditioned suppression, 352, 598 
and escape-avoidance, 602 
measurement and recording, 606 
operant conditioning, 607-611 
Pavlovian conditioning, 61, 89 
Heat: 
as aversive stimulus, 154, 157, 162,
165-169
reinforcer 
and drugs, 188-189 
swimming response, 261, 263, 282 
autoshaping, 59, 60, 120, 121 
and thermoregulation, 153-158, 161- 
166 
Hemiplegia, 19 
Herbivores: 
caloric regulation, 47 
feeding patterns, 36, 47-48
Heroin:
addiction, human, 563 
classification, 545 
Hexagonal apparatus, 126, 133, 139, 147
Hibernation, 46
Hierarchy, in language theory, 630, 633,
643-647
High carbohydrate diet, and thermoregulation,
162
High-fat diet, and thermoregulation, 162
High protein diet, and thermoregulation,
162
Higher-order schedule (see also Second-
order schedule):
and behavioral structure, 643
Hippocampus:
and avoidance, 605
self-stimulation, 589
Hippocampus, dorsal:
theta waves, 119
Histamine:
and curare, 611
and taste aversion learning, 14
and thermoregulation, 168
Histofluorescence technique, and neural
pathways, BOs
History:
individual, as determinant of behavior,
174-175, 177-178, 184-186, 192, 196-
19
organism, and ilar Camerata 553 
subject, an aversive conte 499 | 499 
“Hits,” animal psychophysics, 516 
Homeostasis: 
feedback in, 31, 146 
and feeding behavior, 32 
in hypothalamic syndrome, 17 
as motive force, 31 
and temperature regulation, 153, 156,
160, 164-165
Homeotherm (see Endotherm)
a schedule-induced, in pigeons,
Hormones:
and avoidance, 599-601, 604-605
and central reinforcement, 579, 589
and conditioned suppression, 598-599
and depletion, 32
measurement and recording, 606
and thermoregulation, 161-169
Huddling, and thermoregulation, 154, 158
Hue, scaling of, 531-532
Human: 
alcoholism, 563 
artificial language learning, 639-640,
643-647
autonomic conditioning, 607
aversive control procedures, 417, 421
classical conditioning, galvanic skin
response, 493
concurrent matching, 234-239, 242, 243
concurrent superstition, 234- IB
in controlled environment, contingent-
response experiments, 107
delay of reinforcement, 262, 282
drug dependence, 557-558, 563
gastrointestinal changes, and avoidance,
597
grasp response, development, 19
incentive motivation, 636
language, 619-652
memory, 639-640
noise-induced behavior, 418-419
Pavlovian conditioning, 20
psychopharmacology, 563-566
reaction time, 532
recovery from hemiplegia, 19
retardates, 564
self-stimulation of brain, 570, 580, 589- 
590 
shock-elicited behavior, 417-420 
temporal discrimination, carbon mon- 
oxide effects, 563 
thermoregulation, 158, 162-164 
verbal behavior, 628-629, 633-640 
visceral conditioning, 607 
Hunger: 
and conditioned suppression, 353-354 
as hypothetical drive state, 188 
and induced attack, 137 
and induced drinking, 132-133, 137-139 
and magnitude of reinforcement, 259 
and matching, 236, 269 
Skinnerian analysis, 32 
Hunting behavior: 
hyena, 49 
lion, 48 
Hydraulic models, 11
Hyena, hunting behavior, 49 
Hyperphagia: 
hypothalamic, 162 
premigration, 46 
Hyperstriatum, self-stimulation, 590 
Hypertension, and avoidance, 601-604 
Hyperthermia: 
avoidance-induced, 606 
human, 162-163 
rat, 162, 165-166 
Hypnotic drugs: 
classification, 545
barbiturates, 544-545 


Hypothalamus:
anterior, and thermoregulation, 154,
156, 167
and central reinforcement, 570, 572,
575-577, 579, 582-583
and depletion-repletion, 32, 46
electrical stimulation, reinforcer vs.
punisher, 178, 180, 181
lateral:
and attack behavior, 14-15, 17
and autoshaping, 60 
electrical stimulation, 14-15, 60, 111 
feeding and drinking, 14, 16-17, 20, 
32, 35, 46, 155 
lesions, 16-17, 20, 35 
and motivation, 14, 17 
self-stimulation, 585, 589 
and thermoregulation, 155-156 
posterior: 
electrical stimulation, 164, 259 
and thermoregulation, 154, 156, 164 
and thermoregulation, 154-156, 164, 
167-169 
ventromedial: 
lesions, and overeating, 35 
and repletion, 32 
self-stimulation, 589 
Hypothermia: 
cats, 166 
drug-induced, 166-169 
human, 163 
rats, 154, 167-169 
Hysteresis effects, in concurrent matching,
242, 254


Iconic mapping, in language behavior, 
620, 647, 650 
ICS (see Electrical brain stimulation) 
Id, 9 
Ideas, and language, 648, 650 
“Idols of the marketplace,” 13 
Iguana, thermoregulation, 157-158, 166 
Imipramine: 
classification, 545 
and errorless learning, 553
and fixed-ratio behavior, 562
and schedule control, 189, 557 
Imitation, in verbal behavior, 637, 644, 
647 
Immediacy: 
negative reinforcement 
human, 262 
and response strength, 261-262 
primary reinforcement, and informa- 
tion, 318 
reinforcement 
and matching, 233, 234, 251-252, 266 
and response strength, 260-262, 266, 
281 
Immobility, difficulty in shaping, 177 
Incentive: 
central reinforcement, 580, 582-588 
function of reinforcement, 99-101 
induced, 636 
and peak shift, 458-463 
Incentive contrast: 
and induced aggression, 469 
and stimulus control, 458 
Incentive function, reinforcement, 99-101 
Incentive motivation: 
central reinforcement, 582-589 
verbal behavior, 633, 636 
Incentive stimulus: 
and induced behavior, 128 
Pavlovian conditioning, 62 
polydipsia, 131-132 
Incidental stimulus (see Contextual stim- 
ulus) 
Incremental generalization gradient, defi- 
nition, 445 
Indifference, on concurrent schedules, 243 
Indirect variables, and schedule control, 
204-209, 217, 228-229 
Individual differences: 
aversive control, 422, 425 
COD in matching, 243 
peak shift, 442 
schedule control, 186-187 
Individual subject: 
behavioral pharmacology, 541-542 
intensive study, 2 
Indoleamine, and thermoregulation, 169 
Induced behavior (see Schedule-induced 
behavior) 
Induced states, on periodic schedules,
137-138
Induction (see also Negative induction,
Positive induction):
Baconian, 2, 12, 28
and discrimination theory, 75
Infant, thermoregulation, 158-160
Infectious disease, and avoidance, 597-598
Inferior colliculus, EEG activity, 605
Inflection, in language, 620, 623, 632, 639,
643-647
Inflection ratio, calculation, 342
Influenza, and thermoregulation, 163
Information:
and conditioned reinforcement, 303,
313-337
logical problem, 323-324
in natural language, 620-621
negative outcomes, 318-326
Information processing, and stimulus con-
trol, 458
Information theory:
conditioned reinforcement, 322
language, 640
Informativeness:
CS in Pavlovian conditioning, 343
stimuli in automaintenance, 63
stimuli in autoshaping, 56-57, 61, 80,
84
Infusion, intragastric:
nutrients, 35
water, 36
Ingestion rate (see also Eating): 
curve, 174 
polydipsia, 129-131 
Inhibition:
absolute vs. relative, 462
in autoshaping, 61 
in behavioral contrast, 75-76, 82-85 
among behavioral states, 144-148 
by-product of discrimination, 467 
courtship behavior, 121-122 
and dimensional control, 444-447 
and induced activities, 141, 144-148 
among instrumental responses, 115 
latent, 486-487 
among motivational systems, 122 
motor systems, by brain stimulation, 
587-588 
by novel stimulus, 116, 588 
by reinforcement, 264, 272-274 
and self-stimulation of brain, 587-588 
and stimulus control, 5, 432-476 
measurement, 445-447 
terminal response by interim, 142-143 
tests for, 470 
Inhibitory generalization gradient, mea- 
surement, 445-447 
Inhibitory stimulus, definition, 444 
Inhibitory stimulus control, 432-476 
Inhibitory stimulus control: 
determinants, 453-456, 476 
and errorless learning, 470-474 
and schedule variables, 458-466 
Initiation: 
meals, 28-40 
ratio runs, 38 
responding, disruption by lesions, 586- 
588 


Inner-ear defect, in mouse, 517 

Insect, chained reflexes in, 142 

Instinctive act, 15, 93 

Instinctive drift, 3 

Instinctive responses, and theta waves, 119 
Instrumental behavior (see also Operant
behavior):
contingencies among, 98-106 
gerbil, 98 
Instrumental conditioning (see Operant
conditioning)
Instrumental environment, 98-99, 111, 112
Instrumental learning, 14, 53 
Instrumental responses (see also Operant 
behavior): 
role as stimuli, 5 
Insulin: 
and choice of sugar solution, 557 
and electrical brain stimulation, 579 
Integration, learned responses, 20, 22 
Integration, levels of (see Levels of in- 
tegration) 
Intensity: 
current, and central reinforcement, 578,
582
shock-elicited behavior, 418, 422, 427
stimulus 
and overshadowing, 492 
and psychophysical scaling, 532 
and reaction time, 526, 532 
Intensity dynamism effect, auditory
psychophysics, 524
Intention, 30
Intention movements, 139
Interaction:
activities and states, 144-145
behavioral states, 144-148
classical-operant, and conditioned sup-
pression, 341, 344, 348-352, 358-360
concurrent schedules, 70-80, 233-282
(see also Matching) 
dimensions of stimuli, 439 
eating and drinking, 35, 36, 42-43, 131, 
135, 139 
excitation and inhibition, 447-451 
and stimulus generalization, 493-494 
internal and external stimuli, and rein-
forcement, 582-585, 588-589
multiple schedule, 71-85 (see also Be- 
havioral contrast) 
controls for, 72 
definitions, 73 
and Herrnstein’s equations, 266-268
and inhibitory control, 458-463 
theories, 75-91, 266-275 
among non-contingent responses, 114-
115
operant-Pavlovian
and autoshaping, 53-91
and conditioned suppression, 341, 344,
348-352, 358-360
psychophysiological processes, 608, 610-
611
punishment paradigm, 428
among reinforcers, and matching, 252-
254, 264-265
response-shock and shock-shock interval, 
370-371 
running and drinking, 146-148 
sequential: 
behavior and environment, 177 
among induced activities, 144-146 
shock-elicited behaviors, 420-426, 428
stimuli, and stimulus control, 496, 507 
stimulus-reinforcer, response-reinforcer, 
66-71 
terminal and interim activities, 132-133,
139-140, 143
terminal and interim states, 139-140,
143
two kinds of peck, 67
Interburst interval, in contingent-response
experiments, 110-111
Interchangeability, stimuli, responses and 
rewards, 13, 91, 112, 113, 122, 407- 
408 
Interdimensional training: 
discrimination 
description, 439-440 
effects, 444 
and peak shift, 454-455 
and stimulus control, 494-497 
Interference hypothesis, conditioned sup-
pression, 351-353, 357-359
Interfood interval, and sequential inter- 
actions, 147 
Interim activities, 125-148 
Interim activities: 
competition with terminal, 132-133 
and contrast, 273-274
control of, by food, 131-132, 140 
definition, 126 
examples, 128-129, 137 
motivation, 128, 132, 137-139 
schedule-induced, 125-148 
SΔ periods, 135-136
variables affecting, 128-138 
Interim period: 
aversiveness, 139-140 
DRL schedules, 135, 141 
FI and VI schedules, 133, 135-138, 141 
properties, 134-139
Interim states:
interaction with terminal, 139-140, 143
properties, 137-139, 144-146
Interlocking schedule:
fixed-ratio, fixed-interval, 213
and ratio responding, 212
shock postponement, 194 
Intermeal intervals, 32-40 
Intermittent reinforcement (see also Par- 
tial reinforcement): 
central reinforcement, 573-574, 578-579 
and polydipsia, 131-132, 135 
and schedule control, 201-202 
thermoregulation, 164 
Intermittent shock schedules, 
384-386, 397-398 
Internal clock (see Behavioral clock) 
Internal economy, 596 
Interoceptive discrimination, 610 
Interocular transfer, line tilt, 528
Interpeduncular nucleus, 386 
Interreinforcement time: 
concurrent choice, 327-337 
differentiation schedules, 224
fixed-ratio schedules, 205, 209-210, 217
interval and time schedules, 214
and response rate, 210
and responses per reinforcement, 210-
211
as schedule variable, 204-205, 209-211,
214, 228-229
variable interval schedules, 214-215 
Interresponse time: 
and concurrent choice, 255-256 
conditionable response unit, 224 
contingent-response experiments, 109 
drug effects, 559-560 
and generalization gradient, 437-438 
as reinforced response, 223-224 
as response unit, 223-224, 263
role in response rate, 258, 263-265
as schedule variable, 204, 224-225 
as stimulus, 223 
theoretical unit, 223-225 
unit of behavior, 222-225, 263 
Interresponse time schedule (see also
Differentiation schedule, DRL
schedule, DRO schedule):
concurrent, and matching, 251, 255-256
definition, 203
regenerating property, 212
and response strength, 263-264
Intertrial interval:
autoshaping, 57-58, 06, 78, 127
escape procedure,
food deliveries during, 63, 67-68
Interval schedule (see also Fixed-interval,
Variable-interval, Random-inter-
val):
definition, 202
regenerating power, 212 
and time schedule, 214 
Intestinal contractions, operant condition- 
ing, 571, 608 
Intestinal load, 33 
Intracranial stimulation (see Electrical 
brain stimulation) 
Intradimensional shift, 487 
Intradimensional training: 
description, 439-440 
and errorless learning, 466 
and peak shift, 453-454 
and stimulus control, 493-494 
Intrapeduncular nucleus, stimulation, 585 
Intraperitoneal route, drug administra- 
tion, 545 
Intraverbal, in verbal behavior, 634, 646- 
648 
Invariance, schedule performances, 187 
Inverse hypothesis, stimulus control, 507- 
508 
IRI (see Interreinforcement interval) 
IRT (see Interresponse time)
IRTs/Op, and stimulus generalization,
437
Islets of Langerhans, 10
Isobias function, animal psychophysics,
Isocarboxazid, 545
Isomorphism, autoclitic and transforma-
tional theories, 648-649
Isosensitivity curve, animal psychophysics,
533-534
Iteration, theory of grammar, 629
ITI (see Intertrial interval)


Jumping stand, 497 


k:
Herrnstein’s amount of behavior
constancy, 262-263
definition, 257
Hull’s incentive, 99-100
Kantian a prioris, and verbal behavior,
638, 641
Kernel sentences, language theory, 629-
630, 632-633, 643, 648
Key pecking (see Pecking, Pigeons) 
Key pressing, human, and matching, 239 
Kidney, excretion of drugs, 550 
Kidney function, instrumental condition- 
ing, 609-610 
Kinesis, 158 


Language: 
acquisition, 619-627 
artificial, 639-640, 645-647 
operant analysis, 5, 619-627, 633-640
psychology of, 619-652 
theories, 619-652 
Latency, response:
and interreinforcer time, 211
and stimulus intensity, 526-527, 532 
Latent inhibition, 486-487 
Law of availability, 47-48 
Law of common fate, 623 
Law of effect, 233-282
Law of effect:
and absolute response rate, 257-263
autoshaping, 53-54, 62-63
correlation-based, 111
as equilibrium principle, 141
mathematical analysis, 4, 132, 140, 233-
282
Thorndike’s statement, 99
symmetrical, 101-102, 116
weak, 100-101, 118, 122
Laws of behavior, self-sufficiency, 28
Laws of operant behavior, 148
Laws of reflex strength, 30
Learned releasers:
and autoshaping, 69, 70
and Pavlovian conditioning, 70
Learning:
artificial language, 639-640, 643-647
constraints on (see Constraints) 
drug effects, 560-562 
as modification of habitat, 49 
natural language, 619-627 
two types, 54 
Learning disability, human, and drugs, 
564 
Learning effects, animal psychophysics, 
525 
Leash-pulling, squirrel monkey, 183-184 
Lesions (see Brain damage) 
Leucocytes, and pyrogens, 166 
Levels of function, nervous system, 8 
Levels of integration, of operant, 7-24 
Lever contact response, rats, 60, 65, 81-
82, 120
Lever pressing (by rats, for food, unless
otherwise stated): 
acquisition, 561 
and aversive control, 421-425, 428 
avoidance, 116-118, 185, 194-195, 260, 
267-268 
cebus monkey, 102 
college students, 103, 107 
conditioned suppression, 89, 341-356 
and contrast, 73-74, 81-82, 89-91 
drug effects, 561 
escape, 261-262, 282, 319, 367-368, 383- 
387 
and experience, 175, 177 
in extinction, 205 
hamster, 114 
heat reinforced, 154, 162, 164 
heat avoidance, 165, 168 
and induced drinking, 132, 138 
as interim activity, 133 
macaque, 258 
monkey, 89, 178, 185, 194-195, 352 
pigeon, 73, 82, 91 
postreinforcement pause, 142 
and reinforcement immediacy, 260-263, 
281 
and reinforcement magnitude, 258-259, 
278-282 
rhesus monkey, 352 
shaping, 177 
squirrel monkey, 421-425, 428 
stimulus generalization, 437 
and sugar concentration, 259-260, 263, 
269, 279-280 
as terminal response, 133, 145-146 
and thermoregulation, 164-165, 167 
token reinforcement, 306 
Lexeme, 620 
Lexical items, 630 
Lexical system, 620 
Lexicon: 
definition, 620 
use, 648 
Librium (see Chlordiazepoxide) 
Licking (see also Drinking, Polydipsia, 
Schedule-induced behavior): 
air, by rats, 138 
in automaintenance, 65, 120 
avoidance response, in rats, 118 
contingent-response experiments, 104, 
108 
as interim activity, 133 
rate, and polydipsia, 130-131 
as terminal response, 133 
Light intensity (see Brightness) 
Limbic pleasure area, 582 
Limbic system, and central reinforcement, 
572, 585 
Limited availability schedule, and induced 
behavior, 131-132, 135-136 
Limited hold procedure, animal psycho- 
physics, 527 
Limited opportunity schedule, and nega- 
tive reinforcement, 398-406 
Limulus, 9 
Lindsley manipulandum, 602 
Linear VI schedule, 215 
Line orientation: 
generalization, 440, 441, 445, 454-455, 
458, 482, 494, 501-502, 527 
and visual vertical, 527 
Linguistics, 619-627, 628-652 
Lion: 
feeding behavior, 47 
hunting behavior, 48 
Listener, and verbal behavior, 649-651 
Lithium carbonate, 545 
Lithium chloride, 484 


Lizard, thermoregulation, 153, 157-158, 
161-164 
Local contrast (see also Transient con-
trast):
additivity theory, 86-88 
definition, 77 
and Herrnstein’s equation, 268 
necessary conditions, 78 
and overall contrast, 77-78, 86-88 
pigeons, 75-78, 85-88 
rats, 78, 85 
Localizability (see also Diffuseness): 
stimulus, and contrast, 82 
US in autoshaping, 60, 69 
Localization, stimulus on manipulandum, 
490 
Local rate of reinforcement: 
concurrent schedules, 246-247, 272 
fixed-interval schedules, 216 
and schedule control, 214-218 
variable-interval schedules, 214-215 
variable-ratio schedules, 217 
Local rate of response: 
concurrent schedules, 235, 238, 244-247,
274
multiple schedules, 272
and probability of reinforcement, 214-
216
second-order schedules, 302-303 
Lock and key, courtship sequences, 142 
Locomotion: 
amphetamine effect, 554 
and aversive control, 418-426 
Locus coeruleus:
and central reinforcement, 586 
self-stimulation, 590 
Logarithm: 
drug dosage regimen, 551 
in matching equation, 238-239, 243, 
247-250 
Long box, autoshaping with, 63 
Lordosis response, rodents, 21 
LSD: 
classification, 545 
and stimulus control, 554, 556 
Luminal (see Phenobarbital) 
Lysergic acid (see LSD) 


Macaca nemestrina (see Macaque)
Macaque: 
EEG activity, 605-606 
magnitude of reinforcement, 258, 263, 
278-280 
thermoregulation, 153, 159
Mach bands, 452
Macrosaccadic eye movement, 237
Magnesium pemoline, 545
Magnitude of reinforcement (see Rein-
forcement, magnitude)
Maintained generalization procedure:
animal psychophysics, 529-530
description, 434-435
Maintenance (see Automaintenance)
Mammals, thermoregulation, 158-162, 167
Mand, 633, 636-637
Manual manipulation, and aversive con-
trol, 418-430
Marijuana:
behavioral pharmacology, 545, 552
reinforcer, 557
and stimulus control, 555
Markov process, and behavioral sequences,
142
Marplan (see Isocarboxazid)
Masking:
airflow by key stimuli, 491
line orientation by color, 482-483, 491
and overshadowing, 492-493, 500
and stimulus control, 482-483, 491-492,
500, 504, 506 
tone by key stimuli, 491 
Massed extinction, and peak shift, 456-457 
Masseter muscle, 417-418 
Matched shock, extinction procedure, 379- 
380 
Matching: 
and behavioral contrast theories, 272- 
275 
concurrent chained schedules, 297-298, 
329-333 
concurrent schedules: 
evidence, 239-243 
generality, 243, 248-256, 275, 276-277 
vs. maximizing, 245-246
measurement, 235, 248 
vs. multiple schedules, 272-275 
necessary conditions, 234, 242-244 
negative reinforcement, 254 
and punishment, 253-254 
and reinforcement immediacy, 251- 
252, 329-333 
and reinforcement magnitude, 233- 
234, 248-251 
discrete trial procedure, 237, 244-246 
humans, 234-239, 242, 243 
and IRT reinforcement, 225 
and k-parameter, 262-263 
and multiple schedules 
and component duration, 269-272 
conditions for, 269 
and contrast, 79-80 
and Herrnstein’s equations, 260-270 
negative reinforcement, 373-374, 382- 
383, 386-388 
responses, concurrent schedules, 235-
237, 242-244, 272-274
responses and time, 237
concurrent schedules, 238, 246-248,
250, 255
multiple schedule, 272
time:
concurrent schedules, 238-239, 246-
248, 254, 272, 274
in interresponse class, 264
Matching law:
deviations from, 242-243, 256, 266, 270-
272, 276-277
as empirical law, 275-276
equations, 233, 236, 238, 245, 248, 253,
255, 257, 275, 277, 329
generality, 243, 248-256, 275-277
as intuitive assumption, 275-276
linear, 233-234
proportional ratio, 238-239, 242, 247-
249, 252-257, 277
as tautology, 276
as theoretical law, 276
Matching to sample:
animal psychophysics, 518, 525
and conditioned reinforcement, 307-308
pigeons:
chained schedule, 292
second order schedule, 301
and verbal behavior, 638
Mathematical analysis, Law of effect, 4,
132, 140, 233-282
Mathematical model:
avoidance, 365
generalization gradient, 449
Mathematical theory, discrimination
learning, 448-451
Maximization, energy yield /time expended,
47-49
Maximization function, 47-49
Maximizing vs. matching, on concurrent
schedules, 245-246, 254
Maze:
central reinforcement, 572 
contrast effects, 86 
drug effects, 555 
Meal: 
criterion, 36 
duration, 28, 33, 42 
frequency, 28, 33-38 
initiation, 32, 36, 38, 40 
patterning, 28-30, 32-37, 42-45 
and polydipsia, 130-132 
size, 32-35, 37-38, 130-132 
termination, 32, 36, 38, 45 
as unit of analysis, 31-32, 34, 45-46, 49 
Measurement: 
central reinforcement, 572-574 
conditioned suppression, 348-351, 360 
ee 169, 192=195; 202-941, 549= 


physiological changes, 606
preference structure, 108-110
sensory thresholds, 515-525
stimulus control, 433-436, 489-493
Mechanical models, animal behavior, 29-

Mechanism, behavioral, and psycho-
pharmacology, 551-560
Mechanism of action, psychopharma-
cology, 541
Medial forebrain bundle:
central reinforcement, 577, 585, 589, 607
and thermoregulation, 156
Medial geniculate, EEG activity, 605
Mediated generalization, 509
Mediating responses, avoidance theory,
364, 397, 400
Mediation:
in autonomic conditioning, 607-608, 610
in syntax learning, 624
Mellaril (see Thioridazine)
Memory:
and stimulus control, 457-458
and verbal behavior, 637-638
Mentalese, and verbal behavior, 635, 648-
651
Mentalism, and language, 640-642 
Mentalistic concepts, how avoided, 29 
Meperidine, 545 
Meprobamate: 
classification, 545 
and conditioned suppression, 356 
and schedule control, 189, 191, 557 
Mercury vapor, behavioral effects, 562-563
Meriones unguiculatus (see Gerbil)
Mescaline, 545
Mesocricetus auratus (see Hamster)
Metabolic rate, and thermoregulation, 154,
161-162, 168
Metaphorical extension, verbal behavior,
63
Metastability, response patterns, 192 
Methadone: 
classification, 545 
and heroin addiction, 563 
Methanol, 549 
Methamphetamine: 
classification, 545
and punishment, 560 
and schedule control, 559 
and ratio strain, 543 
Methaqualone, 545 
Methedrine (see Methamphetamine) 
Method of constant stimuli, animal psy- 
chophysics, 521-522 
Method of limits, animal psychophysics, 
520-522 
Method of successive approximations, in 
shaping behavior, 54, 62
Methodology:
central reinforcement, 572-574, 589 
and schedule control, 204-229 
Methylphenidate: 
classification, 544-545 
and learning disability, 564 
Metonymic extension, verbal behavior, 
634, 637 
Metronome, 138, 484 
Microstructure, stimulus generalization
gradient, 437-439
Midbrain, and electrical stimulation, 583,
585
Milieu externe, 33
Milieu interne, 33
Milieu therapy, 565
Miltown (see Meprobamate)
Minimal unit hypothesis, 67-68
Minor tranquilizers, 545, 555
ea ee and interocular transfer,
Misbehavior of organisms, 3, 7, 13, 23
“Misses,” animal psychophysics, 516
Mixed schedule: 
and chaining, 291-292
conditioned reinforcement, 320-321,
323-324, 336
and contrast, 74
drug effects, 556
observing responses, 320-321, 323, 324
Model:
central and conventional reinforcement,
584-585
as method of synthesis, 10
Modulation:
behavior
a coors shock, 180, 183, 193, 195-
by environment, 410
by food presentation, 182
response rate, chained schedules, 293,
299
Modulus, for measuring behavior, 262 
Molecular weight, drugs, 549 
Momentary response probability, 108-109, 
112 


Mongolian gerbil (see Gerbil) 


' Monkey (see also Squirrel monkey, Rhesus 


monkey, Cebus monkey, Macaque) 
Monkey: 

auditory intensity discrimination ROG 
curves, 534-535 

auditory thresholds, 516, 519, 521 

avoidance, $966, 406 

central reinforcement, 578 

cocaine reinforcement, 200-251 

color vision, 512 

conditioned suppression, 89, 352, 358- 
360 

dark-reared, 486 

equal-loudness contour, 526 

flicker discrimination, 522 

matching on concurrent schedules, 950— 
25] 

motion aftereffect, 523 

negative reinforcement, 390, 395 

observing responses, 320-321 

pica, 137 

positive conditioned suppression, 358- 
360 

schedule-induced drinking, 138 

second-order schedule, 302 

self-stimulation of brain, 578, 589-590 

spiral aftereffect, 529 

stimulus generalization, 490 

tactual discrimination, 517 

thermoregulation, 164, 167 

visual fixation training, 523 
Monoamine oxidase inhibitors, and self-
stimulation of brain, 585
Monochromatic light, rearing, and gen- 
eralization, 485-486 
Mood, and schedule-induced behavior,
137-138, 144 
Morphine: 
classification, 545 
dependence, 179, 554, 557 
and fixed-ratio behavior, 551 
and punishment, 558 
Morphonemic rule book, in language be- 
havior, 648 
Motility, gastric, dogs, 20 
Motion aftereffect, monkey, 523 
Motivation: 
and conditioned suppression, 353-358,
360
drug effects, 188-189, 559-562 
and electrical brain stimulation, 570, 
576, 579580, 589-889 
feeding behavior, 49 
homeostasis as, 31
and interim activities, 128. 152. 
and matching, 565 
as maximization function, 49
and the operant, 12-13, 15, 19, 33-88 
problems with concept, 14-15 
and schedule-induced behavior, 128-
132, 137-139, 144 
and schedule performances, 188 
Motivation hypothesis: 
conditioned suppression, 351, 353-357,


137-159 


induced drinking, 130, 132 
Motivational properties, induced states,
138-140
Motivational state, as behavioral state, 144 
Motor behavior, and self-stimulation of 
brain, 586-588
Mouse: 
auditory sensitivity, 517 
biting attack, 416-417 
brain biochemistry, 606 
conditioned suppression, 517 
killing, by rat, 17
meal patterning, 55 
neurological mutant, 517 
thermoregulation, 153, 159, 167 
Muller-Lyer illusion, 526 
Multiple causation, verbal behavior, 646-
6 


Multiple concurrent schedule, matching,
249
Multiple schedule: 
animal psychophysics, 515-516 
avoidance, BEY-BF i 
behavior pharmacology, 551-552, 556-
557, 559
central reinforcement, 575
and chained schedule, 2790 
chained and tandem, 294 
component duration 
and contrast, 79, 87, 90 
and matching, 269-272 
negative reinforcement, 382-383 
and conditioned reinforcement, 317, 
323-324, 336 
conditioned suppression, 345, 347 
contrast on, 75-91, 266-275 
drug effects, 188-191, 551-552, 556-557, 
559 
and Herrnstein’s equations, 266—268, 
270-272, 373-374 
induced behavior, 132, 136-137, 138 
interactions on (see Interaction) 
matching, 269-272, 373-374 
negative reinforcement, 367, 369, 381- 
387 
and peak shift, 441-443, 458-463 
relation to concurrent, 78-80, 266-270 
relative response rates, 268 
responses per reinforcer, 211-212 
response-produced shock, 195 
Multiple-stimulus method, generalization 
gradient, 434 


Nalorphine: 
and conditioned suppression, 554 
and morphine dependence, 179, 557 
reinforcer vs. punisher, 179, 192 
Naloxone, 557 
Narcosis, cold, 157 
Narcotics (see also Opiates, Analgesics): 
classification, 545
as reinforcers, 557 
tolerance, 550 
Nardil (see Phenelzine) 
Nativism, and language, 641 
Naturalistic environments, behavior in, 4, 
28-50 
Natural selection, in language learning, 


NE (see Norepinephrine) 
Necessity, as driving force of behavior, 31 
Neck stretching, as interim activity, 137, 
139 
Negative automaintenance, 13, 23, 63-65, 
128 (see also Omission effect, Omis- 
sion training) 
Negative behavioral contrast, 72-75, 78, 
85-86, 267-268 
additivity theory, 85-86 
definition, 73 
logical problems, 75, 86 
pigeons, 75, 78, 85-86 
and positive contrast, 75, 85-86 
rats, 78, 85-86 
Negative contingencies, in omission train- 
ing, 99, 128 
Negative feedback: 
in homeostasis, 31 
as self-inhibition, 144-148 
in thermoregulation, 153-154 
Negative induction: 
definition, 72-73 
pigeons, 82-83, 89-90 
rats, 74, 82 
Negative reinforcement, 364-410 (see also 
Escape, Avoidance, Electric shock, 
Aversive control) 
access to, 376 
acquisition, 404-409 
added cues, 365, 381-396 
concurrent schedules, 254, 375, 386, 388-
392
contingency vs. frequency, 398-399 
definition, 364 
density, 369-370, 375 
discriminative stimuli, 392-396 
escape procedure, 367-368, 383-387 
extinction after, 377-380 
frequency, 260, 369-370, 398-399 
illustrative experiments, 365-367 
immediacy: 
human, 262, 282 
and response strength, 261-262, 282 
intermittent shock schedules, 375-377, 
384-386 
magnitude: 
and response strength, 261, 263, 281- 
282 
scaling, 387-388 
and matching, 254, 269-270, 373-374, 
382-383, 386-388 
noncontingent shock, 378-380 


and Pavlovian conditioning, 364-365, 
390-393, 395-398, 400 
procedures, 364—406 
shock-delay procedure, 365, 370-371,
380, 383, 388-391, 400, 402, 407, 409 
shock delay vs. frequency reduction, 
402-406 
shock-deletion procedure, 365, 371-373, 
375, 382, 384, 386-387, 398-401, 403 
shock-density reduction, 375 
shock-frequency reduction, 365, 373-378, 
380-383, 386-387, 392, 400-406 
similarity to positive, 365-368, 373, 382, 
384, 386, 406 
and stimulus generalization, 450 
and supplementary positive reinforce- 
ment, 404—405 
two-factor theory, 364-365, 393, 396-398, 
400 
two modes of, 387-388 
Negative stimulus, and conditioned rein- 
forcement, 318-326 
Nembutal (see Pentobarbital) 
Neocortex, and instrumental autonomic 
conditioning, 608 
Neostriatum, self-stimulation, 590 
Nephrectomy, bilateral, and drinking, 36 
Nervous system, development, 8, 17-19 
Nest building, gerbils, 115 
Neural basis, reinforcement, 571, 574, 580- 
581, 585-589 
Neural pathways, and self-stimulation,
585-588
Neurochemistry, and thermoregulation,
167-169
Neurons:
catecholaminergic, 571, 585-588
temperature sensitive, 154, 158, 160, 166,
169
Neuropharmacology, catecholaminergic
pathways, 585-586
Neurophysiological system, transient
changes, 597
Neurotransmitters:
and drugs, 544
and thermoregulation, 167-169
Neutral stimulus, after errorless learning, 
467, 471 
Neutral zones, and stimulus control, 453 
Nialamide, 545 
Niamid (see Nialamide) 
Nigrostriatal pathway, 586-588 
Noise escape, 282 
Noise-induced behavior, 418-419 
Nonchaining delay, and gradient of rein- 
forcement, 217-218 
Noncontingent schedule (see Time sched-
ule)
Noncontingent shock, and negative rein-
forcement, 378-384
Noncontingent stimuli, 2 
Nondeprived animals: 
large ratios tolerated by, 36, 38, 43 
meal patterning, 30, 34-35, 45 
Nondifferential reinforcement: 
definition, 433 
and generalized gradient, 436, 439-441, 
454-456 
and overshadowing, 507-508 
and stimulus control, 488—490 
Non-matching to sample, 307-308 
Non-reinforcement, as de-encephalization, 
23 
Nonsense syllables, 639-640, 643-647 
Noradrenalin, and self-stimulation of
brain, 571 
Noradrenergic hypothesis, reward, 571 
Noradrenergic neurons, and self-stimula-
tion of brain, 585-586
Noradrenergic pathways, 571, 585-588
Norepinephrine: 
and escape-avoidance, 605 
and heart rate conditioning, 608 
and stimulus control, 555-556 
and thermoregulation, 167-169 
Nose-key pressing: 
cats, 520 
monkeys, 58 
rats, 516 
ee and temporal patterning, 
21 
Novel stimulus: 
effects on ongoing response, 116 
inhibition by, 588 
Noxious stimuli (see also Electric shock, 
Escape, Avoidance, Aversive control, 
Negative reinforcement): 
and maintenance conditions, 182 
Noyes pellets, 43, 45, 89 
Nursing, rats, 18-19 
Nutritional quality, and eating patterns, 
33, 47-49 
Nuzzling response, chicks, 65 


Obesity: 
and electrical brain stimulation, 14, 579 
experimentally produced, 597 
Object carrying, elicited by brain stimula- 
tion, 571, 582 
Object substitution, Pavlovian condition- 
ing, 62 
Observation method, several responses at 
once, 4, 126-127, 139, 147-148 
Observing responses: 
animal psychophysics, 517-519, 523 
and attention, 30 
and conditioned reinforcement, 313-315, 
318-326, 332, 336-337 
in discrimination training, 508 
maintaining variables, 322 
Obstruction box, and central reinforce- 
ment, 572, 574 
Octopus: 
failure of temporal discrimination, 140 
shape perception, 529 
Oddity problem, animal psychophysics, 
518, 525 
Odorant, stimulus control by, 517, 523 
Olfaction: 
and central reinforcement, 582 
and lateral hypothalamus, 16 
Olfactory bulb, self-stimulation, 589 
Olfactory bulbectomy, and meal fre- 
quency, 35 
Olfactory discrimination: 
pigeons, 517 
rats, 523 
Olfactory pathways, 586 
Olfactory tubercle, self-stimulation, 589 
Omission, reinforcement, second-order 
schedule, 301 
Omission effect (see also Omission train- 
ing, Negative automaintenance): 
automaintenance, 63-68, 128 
Omission training (see also Omission 
effect, Negative automaintenance): 
chicks, 65, 120-121 
dogs, 63 
pigeons, 63-68, 119 
rats, 65 
squirrel monkeys, 65 
stimulus-reinforcer relations, 63-64 
Omnivores: 
caloric regulation, 47 
feeding patterns, 47 
Ongoing behavior: 
and negative reinforcement, 409-410 


and reinforcement vs. punishment, 176, 
180-182, 191, 197 
Ontogenic contingencies, 580 
Operant (see also Instrumental): 
as behavioral state, 144 
concept, 3, 7-8, 11-14 
as conditionable unit, 229 
criterion of motivation, 13-14 
de-encephalization, 22-23
definitions, 175, 186 
as emitted, 174-175 
encephalization, 90-91 
as functionally identifiable class, 175 
hierarchical structure, 29-93 
history, 8-11, 28-31, 53-54, 174-175 
and human behavior, 13 
on interresponse time schedules, 255 
levels of integration. 7-94 
physiological thinking and, 8-11 
problems with concept, 7, 13-14, 23
relation to general psychology, 9, 4-5, 9 
self-stimulation of brain as, 580
Skinner’s justification, 11-19 
and terminal response, 127
as unit, 2, 174-177, 999 
Operant analysis, basic assumption, 29 
Operant behavior: 
and chronic somatic changes, 59? 
and classical conditioning, 53-91, 340-
360


and conditioned suppression, 340, 341— 
351 


language as, 619-627, 633-640 
laws, 148 
and Pavlovian conditioning, 53-91, 340-
360
technical advances, 186-187
and thermoregulation, 153-169 
Operant conditioning: 
autonomic responses, 606-611 
and autoshaping, 62, 66-67, 70, 120 
history, 8-11, 28-31, 53-54, 174-175 
definition, 175 
parameters, and conditioned suppres- 
sion, 344-348, 351, 360 
and physiological states, 596-611
relation to Pavlovian, 54-55, 62, 71, 88,
91, 120
visceral responses, 606-611 
Operant discrimination, and central rein- 
forcement, 
Operant methods: 
advantages, 2, 5, 7
aversive control, 415-430
animal psychophysics, 515-537
central reinforcement, 572-580 
Opiates (see also Heroin, Morphine, Nar-
cotics):
classification, 544-545 
self-administration, 545 
Opportunity hypothesis, induced drinking, 
130-132 
Optimal-duration model, response prob- 
ability, 109-115, 129-193 
Optimal level theory, 99 
Optimization, cost-benefit, in feeding, 46 
Order effects, concurrent matching experi- 
ments, 242, 254 
Oral route, drug administration, 545 
Orientation response: 
animal psychophysics, 523 
in autoshaping, 65, 69 
and contrast, 89 
and hypothalamic lesions, 16-17 
Orrery, language model, 623 
Outcome-dependency, instincts and oper-
ants, 15
Overextension, verbal behavior, 634 


Overmatching, concurrent schedules, 242, 
244, 248, 254-255 
Overshadowing: 
and blocking, 500 
and masking, 492-493 
necessary conditions, 507 
and stimulus control, 492-493, 496, 499- 
501, 503, 506 
theory, 506-508 
Overtraining, 93 


Paced schedule, and choice, 255-256, 266, 
276-277 
Paced VI schedule, and IRT reinforce-
ment, 295
Pacing procedure, 345, 354
Pacing response, schedule-induced, 128,
137
Paired-associates learning, 697 
Pairing: 
concurrent chains, and conditioned re-


; 


US and CS, in autoshaping, 55-55, 137

Pairing hypothesis, conditioned reinforce: 
ment, 919, 315-918, 935, 994-998 

Palatability of food: 

and diurnal eating cycle, 44

and induced behavior, 132-133, 138 
Paleostriatum, self-stimulation, 590
Pallidum, self-stimulation, 590 
Pancreatectomy, 10 


Panel pressing, dogs, on shock delay sched- 
ule, 390 
Panting, and thermoregulation, 154, 167 
Paper shredding, gerbils, 98, 110-111, 114 
115, 118-119 
Parallels: 
applied to operant, 11 
as method of synthesis, 10 
ee recovery and development, &, 
Lined 
Paraphrase, in language behavior, 649-651 
Parenteral route, drug administration, 545 
Parieto-occipital activity, cat, 605
Partial reinforcement (see also Intermit- 
tent reinforcement, Reinforcement): 
advantage of schedules, 999 
and autoshaping, 61 
brain stimulation, 576, 578-575, 585 
and persistence, 500 
Partial reinforcement effect: 
in autoshaping, 61, 64, 118
generalized, 509 
operant conditioning, 61 
Pavlovian conditioning, 61 
and resistance to extinction, 261 
Path-independence, feeding behavior, 16
Pattern: 
drinking, 36, 40, 49
meals, 28=30, 37=37, 42m 
responding
and matching, 255-256, 276 
in preshock stimulus, 350 
and schedule performance, 218-99] | 
265 
Pause: 
interresponse (see Interresponse time)
postreinforcement (see Post-reinforce- 
ment pause) 
Pavlovian conditioning (see also Classical 
conditioning): 
aggressive behavior, 59 
and altered physiological states, 596 
in automaintenance, 62-63 
in autoshaping, 53-55, 58, 60-61, 80-81, 
119-122 
and avoidance, 364-365, 393, 396-398, 
400 
in behavior contrast, 81, 84-85, 88-89

blood pressure, 89 

and conditioned suppression, 88-89 

description, 61-63 

dogs, 61, 138, 340, 483-484, 489, 492 

expectations in, 138 

eye blink, 61 

fear, and conditioned reinforcement, 315 

galvanic skin response, 61 

heart rate, 61, 89 

human infants, 20 

mechanisms, 61-62 

and negative reinforcement, 364-365,
390-393, 395-398, 400 

and operant behavior, 53-91

partial reinforcement, 61 

relation to operant, 54-55, 62, 71, 88,
91, 120
salivation, 9, 53, 61, 138, 240, 480, 403 
sexual behavior, 59 
and shock-correlated stimuli, 390-391
stimulus substitution in, 61-— 8 
trace, and shock-delay, 400
Pavlovian hypothesis, contingent-response
experiments, 119-123 
Pawing:
in automaintenance, 69
schedule-induced, 128, 133
PD (see Pseudodiscrimination)
PDG (see Postdiscrimination gradient)
Peak shift: 
and area shift. 447, 45% 
and behavioral contrast, 446, 454 
by-product of discrimination, 467 
compounding theory, 459465 
concurrent schedules, 468-464 
determinants, 442, 453-456, +76 
double, 442 
and errorless learning, 467, 471
edldfish, AbG 
and interdimensional training, 454-455
and intradimensional training, 4355454 
and massed extinction, 456 
mathematical constraints, 448-449 
necessary and sufficient conditions, 456
negative, 443 
and negative reinforcement, 451, 488 
positive, 441-443
and schedule variables, 458-+96 
Spence's theory, 448 446 
tone frequency, pigeons, 441 
wavelength, pigeons, B45, 441448 
Peck, duration:
in auteshaping, 24, 7-68, 81 
and eantrast, 8728 
on FI and FR, 67 
2QO0d ws. water ranforcement, 5856, &F, 
Bi, 9 
in omission training, 67-68 
Pecking (by pigeons, for food, unless
otherwise stated):
acquisition, 55, 57, 69-69 
aggressive, in pigeon, 59
automaintenance, 62-68
autoshaping, 8, 7, 13, 94, 54-69, 69_70, 
81, 119 
chicks: 
autoshaping, 59, 60, 65, 120, 121 
thermoregulation, 159-160 
conditioned suppression, 345, 346, 351, 
353, 357 
effect of experience, 177 
errorless learning, 464-480 
extinction, 205 
as interim response, 128, 133, 140, 144 
maintenance, and Pavlovian condition- 
ing, 63, 81, 88, 91, 119, 120 
negative reinforcement, 406-407 
operant vs. reflexive, 67 
as Pavlovian response, 54-55, 58, 63, 69, 
81, 89, 91, 119 

positive conditioned suppression, 359 
postreinforcement pause, 142-143 
as prototypic operant, 54-55, 71, 100
rate: 

as controlling stimulus, 504 

as constant, 238—246 
and reinforcement magnitude, 259, 278 
schedule-induced, 128, 130, 140, 144 
shaping, 177 
as terminal response, 126-128, 133, 135,
140-144
voluntary vs. reflexive, 54-55, 67 
Pedal pressing, pigeon, observing re-
sponses, 318, 320
Pedal pushing, dog, avoidance, 605 
Pentobarbital: 
classification, 544-545 
clock schedule, 555 
extinction effects, 561-562 
fixed-interval behavior, 559 
fixed-ratio behavior, 544, 546, 548-549, 
559, 562 
and punishment, 558 
as reinforcer, 192 
schedule control, 542-543, 546, 559 
and stimulus control, 555 
and thermoregulation, 556 
Peptic ulcers: 
and aversive control, 597, 604—605 
and avoidance procedure, 597, 605 
monkeys, 605 
Percentage reinforcement, brief stimulus 
schedule, 316 
Percentile schedule, and IRT reinforce- 
ment, 224 
Perception: 
animals, 526-530 
and drugs, 562 
he ch learning, and language, 643- 
64 
Performance theory, language behavior, 
629 
Periodicity, responding on FI, 208-209 
Periodic schedules (see also Fixed interval, 
Variable interval, Fixed time, Vari- 
able time): 
and induced behavior, 126-140 
Peripheral resistance, and cardiac output, 
602-603 
Permitil (see Fluphenazine) 
Perphenazine, 544-545 
Persistence: 
and central reinforcement, 576-578, 589 
and partial reinforcement, 509 
running as facultative activity, 147-148 
Pharmacology: 
behavioral (see also Behavioral phar-
macology, Drugs, Psychopharmacol-
ogy):
behavioral, 5, 188-193, 202, 540-569 
and conditioned suppression, 349, 350, 
355-357 
thermoregulation, 167-169 
Phenelzine, 545 
Phenobarbital, 544-545 
Phenothiazine: 
classification, 544-545 
and conditioned suppression, 356 
and stimulus control, 555-556 
Philanthus triangulum (see Wasp) 
Phone, element of language, 629 
Phonology, component of language struc- 
ture, 629, 632 
Phrase marker, in language theory, 629- 
630 
Phylogenic contingencies, 29, 580 


Phylogeny of aggression, 417 
Physiological psychology, relation to
operant, 5
Physiological states:
altered, 596-618
durable, 597-598
experimental alteration, 596-611
measurement, 606
technological and methodological devel-
opments, 606
transient, 597
Physiologizing, 9
Physiology vs. operant in analysis of be-
havior, 8-12
Pica, schedule-induced, 23, 137
Pig:
rooting response, 13
thermoregulation, 153, 158-159
Pigeon:
absolute response rate, 257-258
adventitious punishment, 184
adventitious reinforcement, 185
aggressive behavior, 23, 59, 120, 136-137,
139, 468-469, 552
auditory stimulus control, 485, 487-491,
494-495
automaintenance, 62-68
autoshaping, 13, 23, 54-62, 68-70, 80-81,
119
avoidance, 13, 120, 450-451 
behavioral contrast, 76-78, 84-86, 266— 
275, 442, 454 
blocking, 499-500 
brightness discrimination, 521, 532 
central reinforcement, 572, 589 
chained schedule, 291-298 
classical conditioning, 489 
clock schedule, 555 
COD in matching, 243 
conditioned suppression, 345-346, 351, 
353, 357, 517 
concurrent chained schedule, 327-337 
concurrent schedule: 
brief stimuli, 307-308 
matching, 236-256 
concurrent superstition, 234-235 
conditioned enhancement, 89 
conditioned reinforcement, 318-337 
conjoint schedule, 307-308 
dark adaptation, 521-522 
delay of reinforcement, 219 
drug effects, 552-555, 559-560, 562-563 
electrical brain stimulation, 252 
errorless learning, 464-480 
escape, 254 
fixed-time schedule, 126-127 
FR FI performance, 187 
interocular transfer, 528 
interresponse time as basic, 263-264 
key pecking (see Pecking) 
matching: 
concurrent vs. multiple, 272 
concurrent schedules, 236-256 
multiple schedules, 269-272 
and reinforcement immediacy, 251- 
252, 266 
meal patterning, 35 
negative contrast, 78, 85-86 
negative induction, 82 
negative reinforcement, 386-391, 406— 
407 
observing responses, 318-326 
oddity problem, 518 
olfactory sensitivity, 517 
overshadowing vs. masking, 492-493 
pecking (see Pecking) 
polydipsia, 129, 136 
positive conditioned suppression, 359 
positive contrast, 73-74, 78, 81-82 
punishment, 65, 253-254
reinforcement magnitude, 248-251, 259 
response-produced stimuli, 221-222
schedule-induced behavior, 126-129, 
136-139 
second-order schedules, 299-306 
self-stimulation of brain, 572, 589-590 
stimulus generalization, 434-443, 450, 
454-456, 458, 482, 489, 490, 493-496, 
501-502, 527-528, 530-532 
time discrimination, 218, 219 
treadle pressing, 82, 89-90, 236 
VI vs. VR performance, 210 
visual acuity, 517-518, 521, 
wild, 236
wing flapping, 59
Piloerection, and conditioned suppression,
351
Pimozide, 586 
Piperazine, 544 
Placebo effect, 554 
Plasticity principle, response units, 222 
Pleasantness, thermal stimuli, 162-164 
Poikilotherm (see Ectotherm) 
Polydipsia (see also Drinking, Licking, 
Schedule-induced behavior): 
ethanol, in rats, 561 
psychogenic, 23 
schedule-induced: 
behavioral determinants, 129-131 
and body weight, 132 
chimpanzees, 129 
control by eating, 129-135, 138 
doves, 129, 136 
measurement, 129-130 
motivation hypothesis, 130, 132, 138 
pigeons, 129, 136 
and postprandial drinking, 129-131 
rats, 129-130, 132, 136 
regulatory mechanisms, 129 
squirrel monkeys, 129 
stimulus control of, 129, 131 
theories, 130-132 
Polyribosomes, mouse brain, 606 
Population, matched to available re- 
sources, 33 
Positive behavioral contrast: 
definition, 73 
and induced drinking, 132, 138 
and matching law, 266-268, 273-274
and negative contrast, 75, 85-86 
pigeons, 73-74, 77-78, 80-83, 266-268, 
273-274 
and punishment, 77 
rats, 72-74, 78, 82-86 
Positive conditioned suppression: 
and autoshaping, 359 
and Pavlovian conditioning, 88-90 
and punishment, 357-360 
schedule effects, 359 
Positive feedback, in autoshaping, 70 
Positive induction, 72-73 
Postdiscrimination gradient, 451, 493-494 
Posterior marginal gyrus, EEG activity, 
606 
Post food period, and polydipsia, 131 
Postprandial drinking: 
and matching, 252 
and polydipsia, 129-131, 135 
Postreinforcement pause: 
and acceleration of response, 142-143 
conditionable unit, 226-228 
fixed-interval schedules, 213, 216 
and polydipsia, 131 
ratio schedules, 42, 209, 213, 217, 226- 
228, 293, 346, 355 
and ratio size, 209, 212-213, 217 
and rate of responding, 142-143 
tandem schedule, 258 
Power function:
concurrent schedules 
generality of, 255, 256, 277 
VI vs. FI, 255 
human reaction time, 532 
and temporal differentiation, 224, 227 
visual intensity, 532 
Power function matching, equations, 255, 
277 
Power law, sensory magnitude, 530 
Prandial drinking, rats, 18 
PRE (see Partial reinforcement effect) 
Pre-avoidance period, cardiovascular 
changes, 602-603 
Prebehavioral time, and language, 649
Predictability, shock, and gastric ulcer, 
604 
Prediction: 
as method of synthesis, 10 
outcomes of instrumental contingencies, 
98-101, doe lee 
Predictiveness (see also Informativeness): 
stimulus, in autoshaping, 63, 80, 84, 127 
and terminal response, 126-128 
Preening: 
as facultative behavior, 135, 140 
as interim activity, 137, 146 
Preference: 
and avoidance, 366 
central vs. conventional reinforcement, 
577-578 
central reinforcement sites, 572-573 
color, in pigeons, 524 
among concurrent schedules, 233, 235 
among events in world, 101-110 
FI over chained schedule, 335 
and incentive, on multiple schedules, 
459 


multiple over mixed schedules, 336 
position, and animal psychophysics, 518,
524, 532
response, in animal psychophysics, 516, 
518, 521, 524 
saccharine concentration, 106 
scaling of, concurrent schedules, 298 
shorter IRTs, 256
stimulus 
in animal psychophysics, 524
and generalization gradients, 435-436,
440, 442
thermal 
animals, 153, 158-159 
humans, 162-164
VI over FI schedules, 255, 339 
wavelength, effect of rearing, 486 
Preference function, and generalization
gradient, 435-436
Preference procedure, central reinforce- 
ment, 574 
Preference structure, measurement, 108— 
110 
Pre-loading, and induced drinking, 138 
Preoptic area, and thermoregulation, 154- 
156, 167-169 
Preparedness: 
and animal psychophysics, 520 
and negative reinforcement, 407-408
Primates (see also Monkeys, specific pri- 
mates): 
autoshaping, 58, 60 
Priming, and central reinforcement, 576 
Primitives, in verbal behavior, 647-648 
Probabilistic schedule, shock-frequency re- 
duction, 375 


Probability: 
positive stimuli, and observing re- 
sponses, 322-324 
reinforcement:
on concurrent schedules, 234, 237,
244-245
and conditioned reinforcement, 326-
327
and schedule control, 214-215
responses, and reinforcement vs. punish- 
ment, 181 
shock, and shock-frequency reduction, 
374-377 
of US, in classical conditioning, 315, 
343-344 
Probability, differential rules (see Differ- 
ential probability rules) 
Progesterone, and thermoregulation, 162 
Programmed instruction, 7, 467 
Progressive ratio schedule, and central re- 
inforcement, 573-574 
Prolixin (see Fluphenazine) 
Promazine, and clock schedule, 555
Prompting, in language learning, 639-640 
Proportional ratio matching, 238-239, 
242, 247-249, 252-257, 277 
equations, 238, 248 
Propranolol, and blood pressure, 603
Protein, component of self-selected diet, 
a. 
Proximity: 
food and water, and induced drinking, 


to reinforcement in time: 
in chained schedule, 980, 903 


and pene reinforcement, 326—- 
39 


and Herrnstein’s equations, 265 
and induced behavior, 127, 140
and interval relativity, 221 
stimulus to reinforcer, in autoshaping, 
| eae 
Pseudodiscrimination procedure, 495-505,
BOIS 
Psilocybin: 
classification, 545 
and stimulus control, 556
Psychiatry, and drugs, 540, 542, 563-566 
Psychoanalysis, hydraulic models, 11
Psychoendocrine studies, 597-604
Psycholinguistics, 619-627, 628-652 
Psychological distance, to reinforcement,
335


Psychological independence, 
of discrimination, 439 
Psychology of language, 619-652 
Psychopharmacology (see also Behavioral
pharmacology, Drugs):

acquisition and extinction, 560-562

antecedent variables, 553-554 

and behavioral mechanisms, 541, 551-


dimensions 
40 


consequence variables, 556-559 
and drug abuse, 542, 557-558, 563 
human, 563-566 
motivation, 188-189, 559-560, 562 
and schedule control, 542-543, 546, 551- 
552, 555-560 
and self-stimulation of brain, 571, 580 
sensation and perception, 562 
stimulus variables, 554-556 
thermoregulation, 153-155, 164-169,
188-189, 556
Psychophysics, animal, 515-537
methods, 515-532
signal detection theory in, 5, 516, 532-
53
Psychophysics of association, 70
Psychophysiology, classical, 596-597
Psychosomatic disorders, and central rein-
forcement, 571
Psychotic behavior, drug effects, 564-566
Psychotomimetics (see Hallucinogens)
Psychotropic drugs, 545
Puffer fish poison, 166 
Punisher, definition, 176-177 
Punishment, 174-198 
adventitious (see Adventitious punish- 
ment) 
animal psychophysics, 523-524 
and aversive control, 425, 427-430 
change over in matching, 243-244, 253 
and choice on concurrent schedules,
253-254
and choice of reinforcer delay, 252 
and conditioned reinforcement, 
322, 394-395, 396 
and conditioned suppression, 357-358 
contrast experiments, 74, 76, 77 
criteria for, 186-188 
definitions, 176-177, 182, 186 
drug effects, 558-560 
nena relevance and, 113, 115. 116- 


320, 


gerbils, 115
instrumental behavior, 98, 101-108
ae Pee and pattern of behavior, 


in observing procedure, 320-325

pigeons, 65, 253-254

problems in testing for, 102 

as process, 176-177, 189, 186 

recovery from, 194

a to reinforcement, 175-180, 129, 
19 


as reproducible process, 176-177, 189, 
186 


Siamese fighting fish, 116 
Pure operant self-stimulation of brain,


Pure stimulus act, 30 

Purposive acts, 12-13, 99 

Putamen, self-stimulation, 580 

Pyriform cortex, self-stimulation, 589
Pyrogens, and thermoregulation, 165-166 


Quaalude (see Methaqualone) 
Quail: 

autoshaping, 58-59 

sexual behavior, 59 

wavelength generalization, 486
Quantitative analysis: 

discrimination learning, 448-451 

law of effect, 4. 139, 146, 949-999 
Quantitative comparison, drugs, 547 
Quiet biting attack, in cat, 14
Quinine: 

excretion, 330 

and thermoregulation, 155, 167


Rabbit: 
classical conditioning, auditory fre-
quency, 489, 494
eyelid conditioning, 493-496
pancreatectomized, 10
self-stimulation of brain, 589
thermoregulation, 159, 166-167
Raccoon, misbehavior, 13
Rage, in cat, 14
Random-interval schedule:
conditioned suppression, 354
definition, 202
induced drinking, 135
shock avoidance, 267-268
temporal patterning, 213, 265
Random order of stimuli, animal psycho-
physics, 521, 524
Random-ratio schedule:
definition, 202
and positive conditioned suppression,
359
temporal patterning, 213
Random schedule:
definition, 202 
shock frequency reduction, 374-375, 
380-381 
Random-time schedule, definition, 202 
Rat: 
absolute response rate, 258-259 
adventitious reinforcement and punish- 
ment, 184 
aggressive behavior, 136
alcohol reinforcement, 553, 561 
animal psychophysics, 518 
automaintenance, 65 
autonomic conditioning, 607-611 
autoshaping, 58, 60, 81 
avoidance, 116-117, 165, 260, 267-271, 
561 
behavioral contrast, 73-74, 78, 81-86, 
267-268, 270-271 
brightness discrimination, 526-527 
caloric regulation, 43 
central reinforcement, 572-590 
COD in matching, 243 
concurrent matching, 237, 248-250, 278 
conditional enhancement, 89 
conditioned suppression, 89-90, 341-359 
contingent-response experiments, 102- 
109 
delay of reinforcement, 260-262, 281 
development of infant, 18 
discrimination training, 501-502 
diurnal eating and drinking patterns, 
33, 36 
drug effects, 188-191, 349, 355-357, 544, 
546, 548-549, 551, 554-556, 560, 561 
eating rate, 41 
electrical brain stimulation, 14, 60, 120, 
178-180, 237, 259, 279, 518, 572-590 
equal-brightness contour, 526-527 
errorless learning, 467 
escape, 117, 261-263, 281-282, 319, 367- 
368, 383-387 
exploratory behavior, 60, 120 
extinction, 378, 380 
FR behavior, 36-38, 551 
FR FI performance, 187 
gastric ulcers, 597, 604 
grooming, 17 
heart rate in CER conditioning, 352 
heat reinforcement, 154-155, 188, 261, 
263, 282 
hypothalamic syndrome, 16-17 
insulin and sugar preference, 557 
interim activities, 126-139, 143, 147 
isobias functions, 535 
killing of, by cat, 14-15 
latent inhibition, 487 
lever contact response, 65, 81, 82, 120 
lever pressing (see Lever pressing) 
matching on concurrent schedules, 237, 
248-250, 278 
meal patterning, 28, 30, 34-38, 41-44 
mouse-killing, 17 
negative induction, 82 
negative reinforcement, 373-375, 378— 
384, 390-406 
nursing, 18-19 
obesity, 597 
olfactory discrimination, 523 
omission training, 65 
as omnivore, 47 
orientation response, 16-17 
patterning of drinking, 40 
pica, 137 
polydipsia, 129-139, 147, 561 
prandial drinking, 18, 129-131, 135, 252 
reinforcement immediacy, 260-262, 281 
reinforcement magnitude, 248-250, 258- 
261, 278-279 


relation of eating and drinking, 131 

relativity of reinforcement, 102-109, 181 

response-produced stimuli, 222 

running, 100, 103-107, 129-135, 175, 
181-182, 258-263, 278, 281-282 

schedule-induced behavior, 126-139,
143, 147 

schedule performance, 188-191 

self-stimulation of brain, 572-590 

shape perception, 529 

shock-correlated stimuli, 390-398 

shock-frequency reduction, 373-375,
380-384, 402-406 

stimulus generalization, 439, 441, 493, 
497, 554-555 

sugar concentration, 259-260, 263, 279-
280, 557 

swimming response, 261, 263, 282 

taste-aversion learning, 13-14, 24, 484 

thermoregulation, 153-156, 162-168, 
188-189, 556 

thigmotaxis, 117 

thyroidectomy, 18 

token reinforcement, 306 

tooth chattering, 17 

visceral conditioning, 607-611 

Rate: 

changeover, and matching, 234, 243-244, 
249-250, 254, 276 

eating (see Eating, rate) 

ingestion (see Eating, rate) 

local (see Local rate) 

members of reflex chain, 29 

pecking in pigeons, as constant, 238, 
246 


reinforcement 
and conditioned reinforcement, 314 
and conditioned suppression, 345, 355 
local, 214-218 
and matching, 233-244, 247, 249, 259 
and schedule-induced behavior, 130 
response 
absolute, 257-263 
and central reinforcement, 572, 574, 
576-578 
chained vs. tandem schedule, 293-295 
during COD, 244 
and conditioned suppression, 345,
348-351
and FR requirement, 37 
and measure of reflex strength, 28 
property of behavior, 174-176 
and psychophysical scaling, 532 
role of interresponse time, 263-264 
schedule determinants, 228-229 
shock-delay procedures, 370-371 
shocks received, shock-delay pro-
cedure, 371, 373
Rate dependency, drug effects, 189-191, 
555, 558-560 
Rating method, animal psychophysics, 535 
Rayleigh distribution, in animal psycho- 
physics, 534 
Ratio schedule (see also Fixed ratio, Ran- 
dom ratio, Variable ratio): 
components of chain, 292-293, 296 
and concurrent choice, 246-247, 254 
definition, 202 
non-regenerating property, 212
shock delay, 377 
temporal patterning, 217, 219-220 
variable vs. fixed, responses per rein- 
forcer, 207 
Ratio strain, 37, 543 
Reaction time methods, animal psycho- 
physics, 526-527, 532 
Reactive inhibition, 146 
Rearing response: 
gerbil, 111 


hamster, 113, 118
rat, 117-118 
Recall, and memory, 638 
Receiver operating characteristic, animal 
psychophysics, 533-537 
Receptive field model, shape discrimina- 
tion, 529 
Recognition, and memory, 638 
Recovery, stages of: 
compared to development, 17-22 
in hypothalamic dogs, 20 
in hypothalamic rats, 17 
Red nucleus, self-stimulation, 590 
Redundancy, stimuli in autoshaping, 56- 


Reference procedure, extinction, 377-380 
Referent, of words, 619 
Refinement experiment, 30-31 
Reflex: 
and aversive control, 425 
chain, in insects, 142 
chain, within meals, 28, 45-46 
conditioned, 9 
eating as, 29-31 
functional properties, 174
gape, in thrushes, 19-20 
grasp, in humans, 19 
laws of, 31 
and minimal unit, 67 
Sherrington’s definition, 29 
sleep, 22 
as source of behavior, 67 
spinal, 8-9, 11 
strength, 29-30 
thermoregulatory (see Thermoregula- 
tion, reflex) 
as unit of analysis, 29-31, 49, 53 
as unmotivated, 9, 13, 23 
variables governing, 9 
Reflex arc: 
and language, 641 
model of learning, 53 
Reflex reserve, 11, 205 
Refractory phase, of reflex, 29 
Regenerating property, interval and time 
schedules, 212, 229 
Regression toward the mean, responses 
per reinforcer, interval schedules, 
212 
Regulation, temperature (see Thermoreg- 
ulation) 
Regulatory mechanisms, in polydipsia, 129 
Regulatory system: 
constraints on learning, 118-119, 122 
water, in rats, 118 
Reinforcement, 98-123, 174-198 (see also 
Conditioned reinforcement, Central 
reinforcement, Negative reinforce- 
ment) 
adventitious (see Adventitious reinforce- 
ment) 
analysis and history, 98-100 
central, 570-590 (see also Electrical 
brain stimulation) 
similarity to conventional, 574-581, 
584-585, 589 
choice, central vs. conventional, 577-578 
conditioned (see Conditioned reinforce- 
ment) 
contingency (see Contingency) 
in contingent-response experiments, 98- 


criteria for, 186-188 
definitions, 175, 186, 202, 433, 584 
delay (see Delay of reinforcement) 
density: 
in autoshaping, 61 
and conditioned reinforcement, 314, 
316, 325-327, 329, 332, 336-337
on differentiation schedule, 224
and drugs, 552 
and induced behavior, 140 
in time, 214-215 
depletion-repletion model, 31-32 
by drugs, 192-193, 556-558 
electrical brain stimulation (see also 
Reinforcement, central; Electrical 
brain stimulation) 
magnitude, 259 
and matching, 252 
as effect of operation, 202 
excitatory effects, 264, 272-274
food vs. water, and matching, 252 
free, in contrast experiments, 87-90 
peers conditioned suppression, 
and contrast, 76-78, 84, 87 
and induced aggression, 469, 475 
and IRT reinforcement, 225 


and matching, 235-244, 247, 249, 259


and peak shift, 462 
functionally defined, 4 
immediacy (see also Delay of reinforce-
ment)
and central reinforcement, 575 
and matching, 233, 251-252 
and response strength, 260-262, 266, 
281 


inhibitory effects, 264, 272-274
interchangeability, 13-14, 23, 112-113, 
129 


intermittent (see also Intermittent rein- 
forcement, Partial reinforcement) 
and drugs, 558 
and partial reinforcement effect, 201
and thermoregulation, 164 
of IRTs, and schedule performance, 
225, 225 
magnitude: 
a response rate, 257-259, 
conditioned reinforcement, 320, 329, 
326-327 
and contrast, 86, 267 
drugs, 192-193, 250-251, 558 
electrical brain stimulation, 179, 181,
259, 279 
incentive, 100 
and induced behavior, 132-133
and IRT reinforcement, 225
an peannNe: 200-244, 216251, 2A2. 


with meal reinforcement, 45 
and observing responses, 320, 322
and polydipsia, 130-132
thermal reinforcement, 157, 164 
model of, central and conventional, 
584-585 
negative (see Negative reinforcement,
Escape, Avoidance)
neural substrate, 571, 574, 580-581 
nondifferential: 
definition, 433 
and generalization gradient, 436 
noradrenergic hypothesis, 571 
omission, as second-order schedule, 301 
as operation, 176 
partial (see Partial reinforcement, Inter- 
mittent reinforcement) 
percentage, and FI schedule, 301 
Premack’s theory, 101-106, 108-109 
as process, 176-177, 182 
quality: 
and absolute response rate, 259-260 
and choice, 252, 262 
re-evaluation of concept, 4 
relation to extinction, 205 
relation to punishment, 175-182, 197 


relative frequency, and matching, 233-
244, 247, 249, 252 
as reproducible process, 176-177, 182, 
186 
resistance to, and stimulus generaliza- 
tion, 446, 450 
schedules (see Schedules) 
secondary (see Secondary reinforcement, 
Conditioned reinforcement) 
shock-frequency reduction, 365, 373-378, 
382, 387, 392, 400 
shock delay, 402-406 
strength, central reinforcement, 572-574 
systems of, 580-582 
of temporal pattern, 643 
token, 306 
traditional theory, 99 
of verbal behavior, 637 
Reinforcements per opportunity (see 
Probability, reinforcement) 
Reinforcement theory: 
and mathematical analysis, 4 
Premack’s, 101-109 
traditional, 99-101 
Reinforcer:
definitions, 2, 175-177, 202 
examples, 178 
temporal placement, 213-214 
Reinforcibility (see Conditionable re-
sponse unit, Conditionability, Asso-
ciability)
Relations, and control of verbal behavior, 
638-639 
Relative generalization gradient: 
description, 435 
and schedule control, 436, 438 
and stimulus control, 501-509 
Relative proximity principle, 221 
Relative proximity rule: 
as equilibrium principle, 140 
and induced behavior, 140-141 
Relative reciprocal, and IRT reinforce- 
ment, 225 
Relativity: 
conditioned suppression, 349-351 
inhibition, 462 
reward vs. punishment, 101-110, 178- 


stimulus control, 496 
thermal preference, 169-165 
time within intervals, ??0=?71


Releasing stimuli: 


in attack behavior, 15, 22
ethological concept, 410 
learned, in autoshaping, 69, 70
REM sleep, ontogeny, 22
Repletion (see Depletion)
Replication, in behavioral pharmacology, 
542 
Reproducibility, of behavioral baseline, 
and pharmacology, 551 
Reproducible behavioral processes, 174-
176, 182, 186-188, 197
Reproductive state, and thermoregula- 
tion, 161-162 
Reptiles, thermoregulation, 156-158, 161-
162, 166
Reserpine: 
classification, 545 
and conditioned suppression, 355-356
psychiatric use, 540 
and schedule performance, 189 
and self-stimulation of brain, 585 
Resistance to extinction: 
drug reinforcement, 558 
and generalization, 436, 445, 450, 483 
and partial reinforcement effect, 201 
Pavlovian conditioning, 61 
response unit hypothesis, 225-226
and stimulus control, 436, 445, 450, 482,
509 
Resistance to reinforcement, and stimulus 
generalization, 446, 450-451 
Resistance-to-reinforcement test for in- 
hibitory control, 470-472 
Resource allocation, in feeding patterns, 
47 
Respondent (see also Classical, Pavlovian): 
elicited by brain stimulation, 573, 580 
Respondent behavior: 
and conditioned suppression, 341, 349,
348-352 
as interfering response in conditioned 
suppression, 351-353, 358-359, 360 
Respondent conditioning, 3, 9, 58
Response: 
differences, and negative reinforcement, 
407-408 
interchangeability, 18, 91, 119-118, 199 
types, in operant conditioning, 183 
Response contingency, and schedule-in- 
duced behavior, 125-128, 130-136, 
140-141 
Response cost: 
and alcohol consumption, 563 
in human matching, 239 
Response dependency: 
as stereotypic effect, 204 
and temporal contiguity, 228 
Response-dependent schedule, 204 (see 
also Specific schedule types) 
Response deprivation hypothesis, 103-118, 
122-123 
Response enhancement, by electric shock, 
185, 194-195 
Response frequency, variables determin- 
ing, 206-215 
Response functions: 
observing responses, 322-324
in polydipsia, 130 
in schedule-induced behavior, 139 
Response-independent food, in response-
dependent schedule, 128, 138 
Response-independent schedules, 126-128,
132-133, 141, 204 (see also Fixed
time, Random time, Time schedule,
Variable time)
Response-independent shock, and aversive
control, 415-425
Response-initiated schedule, fixed inter-
val
Response number:
as discriminative stimulus, 222 
on FI schedules, 206-208
on FR schedules, 209 
Response-pacing procedure, O48, OFA 
Response patterning: 
interval and time schedules, 214 
ratio schedules, 217-826 
temporal organization, 913-991 
Response-produced shock (see Electric
shock, response produced) 
Response-produced stimuli: 
and chaining hypothesis, 291-999 
and conditioned reinforcement, 319,
322-323


Response rate (see Rate, response)


Response-reinforcer relation: 
in automaintenance, 63-71, 91 
as temporal, 204 
Response-shock interval: 
description, 369-370 
ratio shock-delay schedule, 377 
relation to shock-shock interval, 370- 
371, 388 
shock-delay procedure, 369-371, 388 
shock-deletion procedure, 372
in Sidman avoidance, 369, 370-371 
standardized 20 sec, 599 
Responses per reinforcement: 
differentiation schedules, 224 
fixed-interval schedules, 205-213, 229 
and interreinforcer time, 211-213 
ratio schedules, 207-214 
and schedule performance, 228-229 
Response strength: 
and conditioned suppression, 348-351, 
360 
equations for, 257, 264-267 
and meal termination, 45 
measures, 258, 262 
and negative reinforcement, 260-262 
reinforcement parameters, 234, 239, 
257-263, 278 
theories, 263-264 
Response suppression: 
in behavioral contrast, 76-78, 84 
in contingent-response situation, 103, 
105-107, 112 
by electric shock, 183, 185, 193, 195 
and forced running, 103 
Response unit, specification of types, 222- 
225 


Restraint: 
in animal psychophysics, 523, 527 
and aversive control, 421 
and temporal discrimination, 140-143, 
147 
Retardates, human: 
brain stimulation, 581 
chlorpromazine, 564 
Reticular formation, EEG activity, 605 
Retrieval cues, in verbal behavior, 637 
Reversal, discrimination: 
and errorless learning, 470 
and stimulus control, 457, 470 
Reversibility: 
drug effects, 551 
interactions in multiple schedules, 72 
Reversing chains schedule, and avoidance, 
366 
Reward (see Reinforcement) 
Ye, 99-101 
Rhesus monkey: 
alcohol ingestion, 597 
autoshaping, 58 
avoidance, 185, 366, 599-601, 604 
blood pressure conditioning, 609 
conditioned suppression, 352, 598-599 
drug dependence, 179, 553 
drug-maintained responding, 192 
endocrine changes in avoidance, 599- 
601, 604 
epinephrine and stimulus control, 556 
hormones and avoidance, 599-601, 604 
hypertension and avoidance, 601 
morphine dependence, 179, 553 
peptic ulcers, 597, 605 
shock avoidance, 185, 366, 599-601, 609 
shock-elicited attack, 380 
shock-frequency reduction, 376-377, 380 
shock postponement, 185 
sugar concentration 
and response strength, 260 
Rhodopsin, experimental synthesis, 10, 22 
Ribonucleic acid, rat brain, 606 
Ritalin (see Methylphenidate) 
RNA (see Ribonucleic acid) 
ROC (see Receiver operating character- 
istic) 
Rooting response: 
pig, 13 
pigeon, 65 
Rough grain, on chained schedules, 291 
Routes, drug administration, 543, 545 


RS interval (see Response-shock interval) 
Ruminants, 33, 34, 47 


Running: 
in contingent-response experiments, 
101-109, 181 


as emitted, 175 
as facultative behavior, 133-137, 140, 
146 
and food rate, 133-134 
forced, 102-103, 181 
interaction with induced drinking, 146- 
148 
pattern, effects of schedule, 182 
rat: 
fixed-time schedules, 126, 133-134, 147 
immediacy of reinforcement, 260, 281 
magnitude of reinforcement, 259, 263, 
278, 280 
negative reinforcement, 407 
shock escape, 261-262, 281-282 
temporal pattern, 175, 182 
reinforcer vs. punisher, 181 
and schedule induction, 129, 133-135, 
146-147 
Running wheel: 
avoidance, 116-118 
gerbils, 98, 114, 115, 118 
motorized, 102, 111 
rats, 102-108, 146, 181-182 
Run time distribution, fixed-ratio sched- 
ule, 228 
Runway: 
central reinforcement in, 572 
contrast effects, 86 
reinforcement magnitude, 259-261, 278- 
282 
shock escape, 261-262, 281-282 


Saccharine, in contingent-response experi- 
ments, 106, 108 
Saccharine-glucose solution, preference by 
rats, 578 
Salience, stimulus: 
and conditioned reinforcement, 317 
relation to validity, 496 
and stimulus control, 483, 492, 496, 500, 
506 
Salience, warning stimulus, and shock 
postponement, 394-395 
Saline, control in psychopharmacology, 
548, 554, 556 
Salivary conditioning: 
classical, 9, 53, 61, 138, 340, 489, 493 
description, 340 
lesions and, 20 
operant, 607 
Salivation: 
hamsters, and grooming, 114 
instrumental conditioning, 607 
and thermoregulation, 165 
Sample size, behavioral pharmacology, 
541-542, 551 
Satiation: 
central reinforcement, 576-578, 580, 589 
thermoregulation, 164 
Satisfier, in law of effect, 99, 101 
Scalar property, timing on response-delay 
procedure, 400 
Scaling, animal psychophysics, 530-532 
Scallop, fixed-interval, 140, 143, 384, 458, 
643 
Schedule complex, and negative reinforce- 
ment, 384 
Schedule control, 201-229, 288-309 
chained schedules, 289-299 
and conditioned suppression, 345-351, 
360 
drug effects, 188-191, 202, 542-543, 546, 
551-552, 555-560 




electric shock, 193-196 
and generalization gradient, 436, 438- 
439 


induced behavior, 129-138 

and inhibitory control, 458-466 

negative reinforcement, 367, 375-378, 
384-386, 390, 395-398 


and positive conditioned suppression, 


property of operant behavior, 197 
as reproducible process, 176-177, 186,
197
Schedule-induced attack (see Attack,
Aggression)
Schedule-induced behavior, 125-148 (see 
also Interim activities, Polydipsia, 
Attack, Drinking) 
definition, 126 
hypotheses, 130-132 
measurement, 129-130 
motivation, 128-132, 138-139, 144 
pigeon, 126-127 
rat, 126-127, 130 
regulatory mechanisms, 129 
and schedule variables, 129-138 
temporal and sequential structure, 135, 
140-148 
types, 127 
Schedule-induced drinking (see Poly-
dipsia, Drinking)
Schedule-induced running (see Running) 
Schedule performance, 125, 140, 176, 186- 
188, 197, 265 
Schedule performance: 
chained schedules, 288-299 
direct and indirect variables, 204, 228- 
229 
drug effects, 188-191, 202 
and electric shock, 193-196 
interval vs. ratio, and matching, 255 
as multiply determined, 228 
and reinforced IRTs, 223-225 
second-order schedules, 299-306 
theories, 217-221 
Schedules of reinforcement, 201-229 (see 
also Schedule control, specific sched-
ules):
biological significance, 48 
characteristic effects, 176-177, 181, 197, 
201, 213 
and extradimensional training, 503-505 
as fundamental determinants, 201-202, 
229 
history, 201-202 
types, 202-203 
Schizophrenia, 571 
Scopolamine: 
classification, 545 
clock schedule, 555 
and learning, 561 
and stimulus control, 555 
Scrabbling response, hamster, 113-114, 118 
SΔ:
on chained schedule, 296 
in shock-frequency reduction, 381 
SΔ periods, and induced behavior, 132,
135-136, 141 
Seal, animal psychophysics, 519 
Sea lion, visual acuity, 519-520 
Search time, 48 
Secobarbital, 544-545 
Seconal (see Secobarbital) 
Secondary reinforcement (see also Con- 
ditioned reinforcement) 
in autoshaping, 56 
and brain stimulation, 576, 589 
Second-order deviations, on fixed-interval 
schedules, 208-209 




Second-order schedule: 
brief stimuli, 299-306, 313, 315-318 
description, 299 
discrimination of components, 316-318 
drug effects, 193 
electric shock effects, 193 
induced drinking, 135 
interval components, 300-301, 303, 305 
and pairing hypothesis, 315-318 
ratio components, 301-302 
and response units, 226
and tandem schedule, 294, 300, 301 
Seizure activity, and central reinforce-
ment, 573
Selection, terminal response, 127, 133 
Selective attention, 507-510 
Self-administration, drugs, 192-193, 543,
556-558, 561, 563, 586
Self-regulation procedure, central rein-
forcement, 574
Self-selection, balanced diet, 43, 44, 47-49 
Self-stimulation of brain (see Electrical 
brain stimulation, Central rein- 
forcement) 
Semantic, in language structure, 629, 632- 
633 




Semantic memory, 637-638 
Semantics, natural language, 622-624 
Sensation, drug effects on, 562 
Sensitivity index, animal psychophysics, 
533 
Sensory control: 
eating, 20 
transformation, 19-20 
Sensory deficit, and thermoregulation,
155-156
Sensory fields, 14-16, 21
Sensory neglect, 16-17
Sensory scanning, and aversive control, 
418-450 




Sensory stimuli, and central reinforce- 
ment, 582-583, 586, 588-589 
Sensory thresholds, measurement in 
animals, 515-525 
Septum, stimulation, 570, 575-576, 583 
Sequence: 
behavior, and shaping, 175-178, 180 
behavioral states, 144-148 
as conditionable response unit, 226-227
induced activities, 126-127, 140-143 
response, during errorless learning, 464-
465
stimuli:
on chained schedule, 295-296, 334-336
and stimulus control, 456-457 
as theoretical response unit, 225-226 
Sequential dependency, in animal psy- 
chophysics, 
Sequential interaction: 
behavior and environment, 7, 180,
197
among behaviors, 144-145 
Sequential organization, responses between 
reinforcers, 271 
Sequential relations, responding on FI,
0 




Sequential structure, induced behavior, 
140-146 


Serial order, problem in behavior, 641- 
642 


Serotonin, and thermoregulation, 167, 169 
Set point: 
and control theory, 154, 160 
definitions, 154, 160 
in thermoregulation, 160-169 
“Sets,” 9 
17-hydroxycorticosteroid: 
and avoidance, 599-601, 604 
and conditioned suppression, 598-599 


17-OH-CS (see 17-hydroxycorticosteroid) 
Sex hormones: 
and avoidance, 599-601 
and self-stimulation of brain, 579, 589 
and thermoregulation, 162 
Sexual behavior: 
and electrical brain stimulation, 579, 
581 
guinea pigs, 91 
pigeons: 
autoshaping, 59, 119 
as interim activity, 137 
quail, Pavlovian conditioning, 59 
rat, and thermoregulation, 162 
Shape perception, animal psychophysics, 
529 


Shaping: 
animal psychophysics, 520 
and continuity in time, 177-178 
heart rate conditioning, 607 
in language learning, 624 
with negative reinforcement, 406-407 
by response-produced shock, 190 
by successive approximations, 93, 54, 69, 
177, 182, 400 


unnecessary in nondeprived animals, 37,
43

Shaping schedule, 224 

Sharpening, generalization gradient, 454, 
494-495 


Sheep, latent inhibition, 487 
Shivering, and thermoregulation, 154-155, 
164-168 
Shock (see Electric shock) 
Shock delay (see also Avoidance) 
as reinforcer, 402-406
vs. shock-frequency reduction, 402-406 
Shock-delay procedure: 
cues added, 392-398, 400
description, 365, 370-371 
extinction after, 378-380 
multiple schedule, 383 
multiple response patterns, 388 
negative reinforcement, 365, 370-371, 
380, 383, 388-391, 400, 402, 407, 409 
ratio schedules, 377 
response bursting, 388 
and stimulus generalization, 450 
temporal discrimination, 399-401 
Shock-deletion procedure (see also Avoid-
ance):
description, 365, 371-372
negative reinforcement, 369, 371-375,
375, 382, 384, 386, 387, 398-399, 400-
401, 403
temporal discrimination, 400-401
Shock-density reduction, 373
Shock-elicited behavior: 
human, 417-420
mouse, 416 
and negative reinforcement, 380, 408-
409
squirrel monkey, 417-430
Shock-free periods, concurrent schedules, 


390, 392 

Shock-frequency reduction (see also 
Avoidance, Negative reinforce- 
ment): 


as controlling variable, 365, 373-375, 
direct manipulation, 374-375, 380-381 
Herrnstein’s equations, 373-374, 386-
387
vs. shock delay, 402-406
Shock-intensity reduction, 375, 380 
Shock postponement (see Shock delay) 
Shock-shock interval: 
description, 369-370 
fixed vs. variable, 397
relation to response-shock interval, 370-
371, 388 
shock-delay procedure, 369-370, 388 
shock-deletion procedure, 372, 384-385 
Sidman avoidance, 369-370 
standardized 2 sec, 599 
typical values, 370 
variable, 372 
Shuttlebox: 
central reinforcement, 572, 574, 582 
escape by pigeons, 254 
and heart rate conditioning, 608 
Shuttle response: 
negative reinforcement: 
pigeons, 407 
rats, 400, 406 
shock delay, dogs, 393 
Shuttling, and thermoregulation, 157-158, 
164, 166 
Siamese fighting fish: 
attack behavior, 15, 59, 65 
automaintenance, 65 
Pavlovian conditioning, 59, 63 
punishment, 116 
Sign stimuli, 15, 23 
Sign tracking:
autoshaping and automaintenance, 69,
discrimination learning, 69 
Signal detection theory, animal psycho-
physics, 516, 532-537
Signaled shock, and negative reinforce-
ment, 397-398


Similarity:
perceptual, in animal psychophysics,
530-532
S+ and S—, and errorless learning, 465- 
466 


Simultaneous discrimination, transfer to
successive,
Simultaneous method, generalization gra- 
dient, 435 
Single-process model of conditioning, 99 
Single-stimulus methods, animal psycho-
physics, 515-517
Single-stimulus method, generalization
testing, 434 
Situational stimulus (see also Contextual 
stimulus): 
and stimulus control, 494-509
6-hydroxydopamine (see also Dopamine):
lesions made with, 586 
and schizophrenia,


ara Pavlovian conditioning, 


Skeletal system, and conditioned sup-
pression, 995-555
Skinner box, 2, 11, 39, 572
Skin temperature, and thermoregulation,
154, 156, 162-164
Sleep: 
ontogeny, BY—B5 
slow-wave, 29 
stages, 22
Smell (see Olfaction) 
Smooth curves, as criterion for laws, 11- 
12, 29-30 
Snake, thermoregulation, 162 
Snuggling response, chicks, 60, 121 
Social behavior, and feeding behavior, 49
Sodium chloride, excretion, 550 
Sodium light, ducklings reared in, 485-486 
Sodium salicylate, and thermoregulation, 
167 
Solubility, drugs, 549-550 
Somatic effects, chronic, behaviorally-in- 
duced, 597-598 
Somnolence, and hypothalamic lesions, 
156
Somnos (see Chloral hydrate)

Sopor (see Methaqualone) 

Spaced responding, shock-delay procedure, 
400 


Spaced responding schedule (see DRL 
schedule) 
Species differences: 
acquisition with negative reinforcement, 
406-409 
animal psychophysics, 519-520 
associability of responses and_ rein- 
forcers, 484-485 
automaintenance, 65-66 
autoshaping, 58, 60, 70, 91 
COD in matching, 243 
conditioned enhancement, 90-91 
conditioned suppression, 346, 359 
contrast, 73-74, 78, 81-82, 91 
feeding patterns, 46-49, 58 
generalization gradients, 435-436 
negative reinforcement, 390 
positive conditioned suppression, 359 
schedule control, 186 
stimulus control: 
tone frequency, 441 
wavelength, 483 
thermoregulation, 153, 156, 167-169 
Species-specific behavior: 
aggression in pigeon, 468 
and aversive control, 425 
central reinforcement, 580-581 
constraints on learning, 3, 112-118, 120-
121
contrast, 273, 275 
and negative reinforcement, 406-409 
and operant conditioning, 183 
pecking in pigeon, 273 
Species-specific defense reaction, 116-118, 
120, 165, 407 
Species-typical behavior: 
and brain stimulation, 571-572 
constraints on learning, 112-118, 120-
121
and negative reinforcement, 408, 410 
Specificity: 
autonomic conditioned response, 608, 
610 
drug action, 551-552 
Spectral sensitivity, goldfish, 517 
Spindle activity, 605 
Spiral aftereffect, monkey, 529 
Sprawling, and thermoregulation, 154, 165 
Squirrel: 
animal psychophysics, 519 
self-stimulation of brain, 589-590 
Squirrel monkey: 
adjusting schedule, 212-213 
attack behavior, 23, 136, 417-430 
automaintenance, 65-66 
autoshaping, 60 
avoidance, 185, 193-196 
blood pressure conditioning, 609 
color vision, 518 
conditioned reinforcement, 304-305 
conditioned suppression, 89 
drug effects, 189-191, 557 
electric shock: 
conditioned reinforcement, 304-305 
reinforcer vs. punisher, 178-179, 183- 
185, 193-196 
environmental thermoregulation, 41 
FR FI performance, 187 
hypertension and avoidance, 601 
negative reinforcement, 367, 380, 385- 
386 
omission training, 65 
schedule-induced attack, 136, 417-430 
schedule-induced drinking, 129, 138 
shock-elicited responses, 183-184, 380 


shock escape, 189, 380, 557 
shock-induced attack, 136, 417-430 
thermoregulation, 41, 153, 156 
S-R relations, 11-13, 24 
SS interval (see Shock-shock interval) 
SSDR (see Species-specific defense reac- 
tion) 
Staircase method, animal psychophysics, 
521-522 
Stalking: 
carnivores, as DRL schedule, 48 
cat, 15 
State (see Behavioral state, Interim state,
Terminal state) 
State-space, and motivation, 144 
Steady states, and schedule control, 227— 
228 
Steepening, generalization gradient: 
and discrimination training, 439-453 
in extinction, 438, 483 
Steepness, generalization gradients, and 
peak shift, 448-449, 454 
Stelazine (see Trifluoperazine) 
Stereotaxic procedure, implantation of 
electrodes, 570, 572 
Stereotypic effects: 
and reinforcement, 204 
and schedule performance, 204-205, 228 


Stereotypy: 
activities on fixed-time schedules, 126,
204
collateral behavior, and conditioned
suppression, 347
fixed-ratio performance, 227 
interim activities, 137 
operant responses, 177
superstitious behavior, 127, 204 
Steroids, and avoidance, 599-601, 604 
Sterols, excretion, 550 
Stickleback: 
aggression, 121-122
sexual behavior, 15-16, 121-122 
Stimulants: 
classification, 545
self-administration, 543 
and stimulus control, 555 
Stimuli: 
interchangeability, 13, 23 
relations among, in control of behavior, 
30 
Stimulus change: 
cue in escape, 386 
and maintenance of observing, 321 
Stimulus compounding, and generaliza- 
tion, 459-463 
Stimulus context, and stimulus control, 
457-458 
Stimulus contingencies, and terminal re- 
sponse, 128 
Stimulus control, 432-476, 481-510 
Stimulus control: 
acquired distinctiveness of cues, 487- 
488 
acquisition, conditions affecting, 483-486 
and acquisition speed, 454-456 
by air flow rate, 491 
and amount of training, 453-456 
attentional factors, 481-510 
auditory, 483-495, 528 
aversive baselines, 450, 455-456 
and avoidance, 366, 369 
by behavior, 221-222
by brightness, 490 
and central reinforcement, 582-583 
as continuum, 433 
definitions, 433, 482 
dimensional, and early experience, 485- 
487 
drug effects, 554
by drugs, 192-193, 555-556
dynamic models, 451-453 
and EEG activity, 605-606 
by elapsed time, 218-219 
and errorless discrimination, 186 
examples, 481 
experimental procedures, 488-505 
heart rate conditioning, 607-608 
inhibitory, 432-476 
and amount of training, 453-456 
concurrent schedules, 463-464 
determinants, 432, 453-464, 476 
and errorless learning, 470-474 
measurement, 445-447 
on multiple schedules, 458-463 
and schedule variables, 458-466 
by interresponse time, 223 
of interresponse times, and generaliza- 
tion, 437-438 
and intradimensional training, 493-494
Lashley and Wade’s theory, 483, 485- 
486 
by light, and early experience, 485-486 
by line orientation, 440-441, 445, 454— 
455, 458, 482, 494, 501-502, 527 
and masking, 482-483, 491-499, 500, 502, 
506 
measurement, 433-436, 482-483
multiple schedule, generalization, 458- 
463 
nature of response and reinforcer, 483- 
484 
and negative reinforcement, 366, 369, 
381-396, 399 
by an odorant, 517 
overshadowing, 492-493, 496, 499-501, 
503, 506 
and prior experience, 485-488 
resistance to extinction, 436, 445, 450, 
509 
schedule-induced behavior, 129, 131,
140
signal detection analysis, 536 
theory, 505-510 
and threshold tracking, 522 
by tone frequency, 440-441, 450, 455- 
456 
verbal behavior, 637-640 
by wavelength, 434-439, 441-443, 454, 
485-486 
by wavelength difference, 525, 531, 534 
Stimulus generalization (see also Generali- 
zation): 
aversive control, 450 
brightness, 490, 493 
compound stimulus, 439 
and conditioned reinforcement, 316-318 
conditioned suppression, 351 
diffuse stimuli, 490, 508 
discrimination learning, 447-451 
discrimination training, 498-505 
after errorless learning, 471-472 
extradimensional training, 497-505 
line orientation, 440-441, 445, 454-455, 
458, 482, 494, 501-502, 527 
monkey, 490 
and Pavlovian conditioning, 481, 483- 
484, 487, 489, 492, 493 
Pavlovian theory, 483 
and shock avoidance, 395 
tone frequency, 440-441, 450, 455-456, 
493 




human, 493 
rabbit, 493 
wavelength 
ducklings, 485-486 
pigeons, 434-439, 441-443, 454, 489 
Stimulus generalization gradient: 
absolute vs. relative, 435 




as artifact, 438-439 
compounding theory, 459-463 
as continuous, 438 
definition, 433 
determinants, 432, 436-439 
fixed-interval schedule, 458 
inhibitory, 445-447, 453-456, 459-463 
interdimensional training, 494-497 
maintained, 434, 435, 529-530 
measurement, 433-436 
microstructure, 437-439 
necessary and sufficient conditions, 483 
peak shift, 441-443 
Spence’s theory, 447-451 
techniques for obtaining, 433-436 
transient methods, 434 
Stimulus presentation methods, animal 
psychophysics, 520-525 
Stimulus-reduction procedure, and gen- 
eralization gradient, 447 
Stimulus-reinforcer relations: 
in automaintenance, 63-67, 70-71 
autoshaping, 54-56, 70-71, 80 
behavioral contrast, 80-84, 86—88 
in operant procedures, 71-75, 91 
Stimulus set, and generalization, 443-444 
Stimulus substitution: 
autoshaping, 62, 70 
Pavlovian conditioning, 61-62 
Stimulus surrogation, autoshaping, 62, 70 
Stimulus variables, psychopharmacology,
554-556 
Stochastic process: 
in behavior sequences, 142, 146 
and temporal discrimination, 141-142 
Stochastic transitivity, and choice, 322-333 
Strain, on ratio schedules, 37, 543 
Strategies, feeding behavior, 38-39, 46-49 
Stress, effects of, 5, 507 
Strict learning procedure, linguistic struc- 
ture, 624-626 
Strict training procedures, language learn- 
ing, 624, 640 
Stroke, recovery from, 19 
Structural analysis, language, 3, 629-633,
642-651
Structure: 
induced behavior, 140-146 
language, 622-633, 642-651
Subduction, of behavioral state, 145 
Substantia nigra: 
lesions, 582 
self-stimulation, 585, 590
Success, as method of synthesis, 12 
Successive approximations, method of 
shaping, 54, 62 
Successive discrimination:
multiple schedules, 71 
transfer to simultaneous, 487 
Successive stage experiments, extradimen- 
sional training, 497-500 
Sucrose: 
concentration, and response strength, 
259-260, 263, 279, 280 
pellets, and response strength, 278 
solution, magnitude of reinforcement, 
248-249, 259-260 
Sugar (see Sucrose) 
Summation, stimulus, 443-444, 459 
Summation method, and generalization
gradient, 446-447
Sunflower seed, 98, 114 
Superego, 9 
Superimposition procedure, errorless
learning, 466
Superstitions, concurrent, 234, 235 
Superstitious behavior (see also Adven- 
titious reinforcement): 
Skinner’s view, 127 




“Superstitious” responding, 13, 127, 132, 
204, 359 
Suppression: 
behavior, prior to shock, 422, 425 
concurrent response, by punishment, 
253 
conditioned (see Conditioned suppres- 
sion) 
interim activities, by prevention, 140- 
143, 147 
response: 
in behavioral contrast, 76 
by free food, 128 
by S— on chained schedule, 297 
by shock, 182, 185, 193, 195 
running, by terminal and interim
activity, 133
terminal response by interim, 143 
Suppression ratio: 
calculation, 342 
problems with, 342, 345, 348-351 
Suppressive summation, and _ stimulus 
generalization, 443-444, 459 
Supraliminal stimuli, animal psychophysics 


’ , 
interim 


of, 525-532 

Surface structure, language, 629, 631-633,
648-649

Surrogate mothers, and thermoregulation, 
159 

Surrogation, stimulus (see Stimulus sur- 
rogation) 

Sweating, and thermoregulation, 154, 165, 
167 

Swimming, escape response in rats, 261,
263, 282


Switching response (see also Changeover) 
Switching response, concurrent schedules, 
254-255 
Syndrome: 
lateral hypothalamic, 16-16 
morphine withdrawal, 179, 554 
Syntactic behavior: 
generative, 639-640 
theories, 640-642 
Syntactic structures, 628 
Syntactic system, definition, 620 
Syntax: 
acquisition, 619-627, 643-647 
in chimpanzee, 640 
component of language structure, 629-
632
functional approach, 630-640, 643-647
Syntax crystal, 622-627, 647
Synthesis, experimental methods, 10-11 
Synthetic VI schedule, 263-264
Systems, reinforcement, ®86-582 
Systems constraint hypothesis, 118-119,
122

“Tables of discovery,” 12 
Tact, verbal behavior, 634-635 
Tactual discrimination, monkeys, 517 
Tandem schedule: 
absolute response rate, 258 
and chained schedule, 290, 293-295, 298 
description, 290, 315 
drug effects, 542-543, 555 
fixed-interval and fixed-ratio, 210-211,
293
and IRT as conditionable unit, 224 
and second-order schedule, 300-301, 316 
Taste: 
aversiveness, in rats, 527 
and central reinforcement, 582 
pathways, 586 
Taste-aversion learning, 7, 13-14, 24, 484 
Taxis, 158 
Taxonomy: 
behavior, 12, 24 
behavioral interactions, 144-145
TD (see True discrimination) 
tD — tA schedules, negative reinforcement, 
376-377, 399 
Tegmentum, single unit recording, 605 
Telencephalon, stimulation, 577 
Temperature regulation (see Thermo-
regulation)
Temporal contingency: 
classical conditioning, 343, 349-350 
and terminal responses, 127-128 
Temporal control: 
and conditioned reinforcement,
317-318, 336-337
interval schedules, 217 
language behavior, 643-647 
ratio schedules, 217 
Temporal discrimination:
carbon monoxide effects, 563 
and conditioned suppression, 349-350,
oe 
and interim activities, 140, 142-143
and polydipsia, 132 
restraint effects, 140, 142-143
and schedule control, 217-219 
and shock-delay, 399-401 
stochastic processes, 140-141 
and theory of patterning, 217-220 
Temporal factors, in conditioned rein- 
forcement, 314, 317-318, 326-337 
Temporal frame, in verbal behavior, 635 
Temporal integration, in cerebral cortex,
641-642
Temporal location, component of chain 
schedule, 289, 293 
Temporal pattern: 
in contingent-response experiments, 
107-108 
escape behavior, 425-427 
in preshock stimulus, 350, 352 
and proximity to reinforcement, 765 
punishment behavior, 427-430
reproducible behaviors, 175-176 
responses: 
and brief stimuli, 302, 308
chained schedules, 289-291 
second-order schedules, 300-302 
and schedule control, 213-215
on schedules, theories, BY 7-29} 
shock-elicited behavior, 418-425, 428
as unit of behavior, 222
Temporal placement, reinforcer, 213-218, 
£29 
Temporal structure:
drug effects, 562 
induced behavior, 140-146 
verbal behavior, 643-647 
Temporalis muscle, 4{"_-A18
Temporospatial factors, and central rein-
forcement, 575
Terminal periods, on periodic schedules, 
133, 137-139 
Terminal ratio, definitions, 978-574 
Terminal response:
competition with interim, 132-133, 139
definition, 126
and errorless learning, 464, 475
operant view, 127 
Pavlovian view, 127 
relation to reinforcer, 273-274 
schedule-induced, 126, 133, 139, 141 
Terminal state: 
interaction with interim, 139-145 
properties, 137-138 
Termination, meals, 28, 32, 45 
Tetrahydrocannabinol (see also Mari-
juana):
and acquisition of avoidance, 560
classification, 545

o14,
Tetrahydrocannabinol (cont.)
and peck rate in pigeons, 552 
and stimulus control, 555 
tolerance, 553 
Tetrodotoxin, 166 
Thalamus: 
EEG activity, 605 
self-stimulation, 590 
THC (see Tetrahydrocannabinol) 
The Behavior of Organisms, 9, 32 
Thematic classes, verbal responses, 637 
Theoretical response unit, 222-226 
Therapeutic effects, drugs, 540, 542-545, 
563-566 
Thermal gradient, 157 
Thermal preference, 153, 159, 162-164 
Thermal reinforcement, 153 
Thermally sensitive units, 154, 158, 160, 
166, 169 
Thermocline, 157-161 
Thermodynamics, 31
Thermoregulation, 156-169 
Thermoregulation: 
alligators, 153 
amphibians, 156-158 
baboons, 153, 166 
and behavioral homeostasis, 153 
birds, 159-160 
cats, 153, 166-167 
chicks, 153, 159-160 
and diet, 162 
digestion and, 162 
dogs, 153, 159, 166-167 
doves, 153 
drug effects, 153-155, 164-169, 188-189, 
556 


fish, 156-158 
frogs, 157-158 
goldfish, 153, 158 
hamsters, 159 
and hormones, 161, 162 
human, 158, 162-164 
and hypothalamus, 154-156, 164, 167- 
169 
iguanas, 157-158, 166 
infants, 158-160 
lizards, 153, 157-158, 161-164 
macaque, 153, 159 
mammals, 158-159, 162, 167 
monkey, 164, 167 
mouse, 153, 159, 167 
negative feedback in, 153-154 
neural controls in, 153-155, 167-168 
operant, 153-161, 164-169 
pig, 153, 158-159 
and preoptic area, 154-156, 167-169 
rabbit, 159, 166-167 
rats, 153-156, 162-165, 167-168, 188-189, 
556 
reflex, 153-156, 160-161, 165-168 
reptiles, 156-158, 161-162, 166 
respondent, 154-156, 160-161, 166-168 
snake, 162 
squirrel monkey, 41, 153, 156 
Theta waves, 119, 605 
Thigmotaxis, 117 
Thiobarbiturates, 549 
Thioridazine, 545 
Third-order deviations, in fixed-interval 
schedules, 208, 213 
Thirst, and induced drinking, 138-139 
Thorazine (see Chlorpromazine) 
Three-term contingency, and verbal be- 
havior, 633 
Threshold: 
definitions, 520-522 
sensory, in animals, 515-525 
Threshold tracking, animal psychophysics, 
521-523, 525 


Thrush, gape response, 19-20 
Thwarting, 23-24 
Thyroid hormones, and avoidance, 599- 
600 
Thyroidectomy: 
rats, 18 
and thermoregulation, 154 
Tilt (see Line orientation) 
Time: 
allocation, and generalization, 435 
as discriminative stimulus, 217-219 
distribution of responses in, 106-108 
as indicator of value, 101 
matching: 
on concurrent schedules, 238-239,
246-248, 254, 272, 274
in interresponse class, 264
as stimulus on FI, 218-219
Time course, drug effects, 548-549
Time distribution:
among concurrent schedules, 233, 246-
247, 272
among interresponse times, 263-264
Time out:
on electric shock schedule, 179, 194-195
escape from S—, 469-470
negative reinforcement schedule, 367,
385-386, 388-390, 398
punishment in animal psychophysics,
523
from reinforcement:
and alcoholism, 563
as punisher, 244, 253
and reinforcement omission on FI, 301
and schedule-induced behavior, 139
Time schedule (see also Fixed time, Ran-
dom time, Variable time):
definition, 202
and interval schedules, 214
Timing, induced behavior sequences, 141-
143
Titration method: 
animal psychophysics, 521 
self-stimulation of brain, 574 
T-maze, drug effects, 555 
Tofranil (see Imipramine) 
Token economy, 7 
Token reinforcement, and conditioned
reinforcement, 306
Tolerance:
drugs, 550, 553
large fixed ratios, 36, 38-39, 45
Tone:
as contingent stimulus, 115
stimulus in autoshaping, 69, 273
stimulus in conditioned suppression,
89, 341-344, 351
stimulus in contrast, 274-275
Tone frequency, generalization:
guinea pig, 441
pigeons, 440-441, 450, 455-456, 495, 528
rat, 441
Tongue flip, frogs, 121
Tooth chattering, rat, 17
Topography:
instrumental vs. contingent response, 
108 
lever contact response, 60, 65, 81-82, 
120 


pecking, on FI and VI schedules, 141 
response: 
in autoshaping, 24, 58-60, 67, 81, 
119-122 
on differentiation schedules, 203 
and matching, 262 
and negative reinforcement, 406-408 
Toxic effects, drugs, 546 
Toxicology, behavioral, 562-563 
Tractus solitarius, 586 




Traditional pairing hypothesis, condi- 
tioned reinforcement, 314, 318, 325 


Tranquilizer (see also Drugs, specific 
drugs): 
and conditioned suppression, 349, 355- 
357 


punishment effects, 558-559 
and schedule control, 188 
Transfer (see also Generalization, Stim- 
ulus generalization): 
in animal psychophysics, 526-530 
in behavioral sequences, 147-148 
common elements theory, 639 
simultaneous to successive discrimina- 
tion, 487 
stimulus control, and errorless learning, 
465-466, 475 
in verbal behavior, 634, 649 
visceral learning, 608 
Transfer tests: 
conditioned reinforcement, 288-289 
induced states, 138 
Transformational grammar, theory, 629- 
649 
Transient contrast, 77, 268, 275 (see also 
Local contrast) 
Transient generalization, methods, 434 
Transition, of situation, as reinforcer, 381 
Transition performance, chained sched- 
ules, 290-291 
Transition states, and schedule control, 
227-228 
Transitivity, stochastic, and choice, 332-
333
Transituational reward, 100-101, 113 
Translation, language, 650 
Transposition, in discrimination learning, 
447, 497 
Treadle pressing, pigeons: 
avoidance, 450, 455-456 
concurrent schedules, 236 
conditioned enhancement, 90 
contrast, 82, 89 
negative reinforcement, 386-387, 391, 
407 
positive conditioned suppression, 359 
relation to reinforcement, 273 
stimulus control, 438, 455-456 
Treadmill, 40 
Trial, duration, in autoshaping, 57-58, 66, 
90 
Trifluoperazine: 
classification, 545
and punishment, 558 
Triflupromazine, 545 
Trigeminal nerve, section of, 15 
Trilafon (see Perphenazine) 
True discrimination procedure, 495-505, 
507-509 
Truly random control procedure, 55-56, 
61, 84, 120, 343, 391 
t — r schedules: 


explanation, 203 
and IRT reinforcement, 223 
Turtle, animal psychophysics, 519 
Twittering, chicks, 60 
Two-factor theory, negative reinforce- 
ment, 364-365, 393, 396-398, 400 
Two-key concurrent schedule, description,
234
Two-process model, conditioning, 99 
Two-response methods, animal psycho-
physics, 517-519
Two-stage model, food motivation, 32 
Two-state analysis, fixed-interval respond- 
ing, 265 


Ulcers, and aversive control, 597, 604-605 




Uncertainty reduction, and conditioned 
reinforcement, 318-326, 336-337 
Uncertainty reduction hypothesis: 
conditioned reinforcement, 313-315,
318, 322-325, 330-337
quantitative implications, 322-325
Undermatching, concurrent schedules,
242-243, 248-255
Unit, linguistic, 620-621, 629-630
Unit, response (see Response unit)
Unit schedule:
part of chain schedule, 289
part of second-order schedule, 299-302
token delivery, 306
Units of behavior, problem of specifica-
tion, 29
Universals, linguistic, 641
Up-down method, animal psychophysics,

UR, definition, 340 
Uridine, in mouse brain, 606 
Urination: 
and conditioned suppression, 351 
guinea pigs, 21 
US: 
autoshaping, 55, 58, 61, 69-70, 80 
conditioned suppression, 342-344 
in contrast, 81, 88 
definition, 340 
fear conditioning, 315 
Utility:
and matching, 277 
rate increase in FR, 44 


Validity, stimulus, and stimulus control, 
494-496, 500, 506
Valium (see Diazepam)
Valsalva maneuver, 609
Value:
of events to organism, 101
intervening variable, in matching ex- 
periments, 276 
scaling, concurrent schedules, 298 
Variability, behavioral sequences, 142, 146
Variable cycle procedure, shock deletion, 
387 
Variable delay procedure, shock delay, 
3/2=5145 
Variable-interval schedule: 
absolute response rates, 257-265
and adventitious punishment, 184 
autoshaping, 127 
aversive control, 428 
avoidance, 260, 269-271
central reinforcement, BY 
concurrent (see Concurrent schedules) 
and conditioned suppression, 89, 341- 
345, 356, 350, 508 
contingent-response experiments, 108-
and contrast, 71-85 
definition, 202 
electric shock, 178-179, 194 
electrical brain stimulation, 181, 259 
escape, 384, 386 
generalization gradient, 434, 436, 438, 
440, 446, 460, 462 
induced behavior on, 132-135, 141 
induced drinking on, 129, 132, 135 
interim periods on, 133, 135 
local rate of reinforcement, 216 
matching on (see Matching, Concurrent 
schedules) 
negative reinforcement, 384 
observing responses, 319-320, 323-324 
paced, 225 
probability of reinforcement, 214-215 
relation to variable ratio, 210 
response strength, 257-265 


temporal patterning, 213 
types of distribution of intervals, 215 
uncorrelated shock, 350 
Variable-ratio schedule: 
conditioned suppression, 346 
definition, 202 
and generalization gradient, 436, 439 
matching, 246-247, 254 
observing responses, 321 
relation to variable interval, 210 
and sexual behavior in sticklebacks, 121
temporal patterning on, 213, 217 
Variable-time schedule: 
concurrent matching, 237-238 
contrast, 76-77, 80-81, 84-85, 90 
definition, 202 
electric shock, 397-398 
induced behavior, 127, 130 
Vasoconstriction, and thermoregulation, 
Vasodilation, and thermoregulation, 154, 
165, 167-168 
Vasomotor responses, and thermoregula-
tion, 154, 161
Vector properties, responses, 177
Ventral noradrenergic neurons, stimula-
tion, 585
Ventral tegmentum, self-stimulation, 590
Verbal Behavior, 5, 628, 635-636, 638, 639,
642, 649
Verbal behavior, theory, 628-629, 633-640
Verbal learning, and language, 637
Vertical, perception of, in pigeons, 527
Vesprin (see Triflupromazine)
Vibratory stimulus, stimulus control by,

Visceral-alimentary durable


changes, 397 
Visceral learning, 607-611


Visceral earee Oopcrant conditioning, 


system, 


Vision, rats, after lesions, 16 
Visual acuity: 
pigeon, 517-518, 521
sea lion, 519-520
Visual discrimination, LSD effects, 55d 
Visual fixation training, in monkey, 523 
Visual stimulus control (see also Bright-
ness, Line orientation, Wavelength,
stimulus control):
pigeon, 482, 483, 485, 488-490
Vitamin A, 10, 22 
Vitamin deficiency, and thermoregulation,
154
Vocalization response, animal psycho-
physics, 519-520
Voltage reduction, reinforcement, in rats,
GI

Voluntary behavior, key pecking as, 53-55, 
67 

Voluntary mediators, in instrumental 
autonomic conditioning, 610 


Walden Two, 7 
Warm up effect: 
animal psychophysics, 521 
aversive control, 426 
negative reinforcement, 409 
Warning signal, in avoidance: 
and endocrine changes, 599 
and gastrointestinal changes, 604 
Warning stimulus (see also CS): 
aversiveness, 365 
and avoidance theory, 364-365, 393-398, 
400 
central reinforcement, 579 
shock-delay procedure, 393-398 
Wasp, hunting behavior, 142 
Water:
escape, by rats, 261, 263, 282 
interaction with food: 
and induced behavior, 131, 135, 139 
and matching, 252 
reinforcement, and peck duration, 58- 
60, 67, 81, 119 
reinforcer in automaintenance, 65, 67 
reinforcer in autoshaping, 58-60, 62, 81, 
119 
reinforcer in induced drinking, 138-139 
Water balance, in polydipsia, 129 
Water bath, and thermal preference, 163 
Water holes, patterning of visits, 36 
Water intake, and food intake, 35
Water restriction, and meal patterning, 35 
Wavelength: 
difference, stimulus control by, 525, 531,
534
maintained generalization, pigeons, 530 
stimulus generalization: 
chicks, 486 
ducklings, 485-486 
Japanese quail, 486 
monkeys, 486 
pigeon, 434, 439, 441-445, 454, 489,
493-494, 501-508, 531-532
lee discrimination, pigeon, 519, 
594 


Weak law of effect, 100-101, 115, 112


Weber fraction, time discrimination, 219


Weight loss:
between meals, 99, 96 
and performance, 30, 40 
“Well-behaved” operants, 183
Wheel cranking, college students, 103, 107
Wheel running (see Running)
Wheel turning, and negative reinforce-
ment, 2390-290]
White light, and wavelength generaliza- 
tion, 486 
Wing flapping: 
aggressive, in pigeon, 59
schedule-induced, in pigeon, 126, 157,
189 
Wistar rats, 6605 
Withdrawal syndrome, morphine, in mon-
key, 179, 554
Within-meal behavior:
as locus of analysis, 30, 48 
rate changes, 44 
Words, as discriminative stimuli, 619, 621


Worsening, conditions, and contrast, FE 


X-ray, poisoning by, association with food,
13-14, 484


Yes-no procedure, animal psychophysics,
518-522, 524
Yoga, and physiological conditioning, 607
Yoked controls:
avoidance, and peptic ulcer, 604-605
brain biochemistry, 606 
chained schedules, 296 
conditioned reinforcement, 319 
conditioned suppression, 346, 357 
contingent-response experiments, 106-
108, 111, 115
differentiation and variable-interval,


fixed-ratio vs. interval, 217 

and matching, 272 

omission training and automaintenance, 
65-66, 128 

VI vs. VR in pigeons, 210 

and stimulus generalization, 436 


Zona incerta, self-stimulation, 590 


